Compiled By: Syeda Beenish


Open source to be an essential 
part of university curriculum 
worldwide 


Red Hat has established a partnership with more 
than 100 colleges and universities all over the globe 
to provide enhanced training in open source. As 
part of the partnership, Red Hat will also provide 
certifications in upcoming technologies like DevOps, 
cloud IT automation and many others. 

To name a few, Maharaja Ranjit Singh Punjab 
Technical University and Amity University 
have recently included a systems administration 
curriculum in their updated engineering syllabus. In line with recent developments 
in the field of open source in India, Red Hat India has recently signed a 
Memorandum of Understanding with Telangana Academy for Skills and 
Knowledge (TASK), the Department of ITE&C, government of Telangana. 

Red Hat Academy’s core curriculum comprises systems administration 
and systems engineering on Red Hat Enterprise Linux, besides Red Hat JBoss 
middleware administration. For cloud IT automation, the curriculum includes 
introductory Red Hat OpenStack (cloud) content and technical training to the 
associated faculty members. 

The courses are global and have been incorporated in colleges at 35 
locations across North America and 85 locations worldwide. The instructors 
are selected based on their Red Hat certifications, and their knowledge of the 
course’s curriculum. 




LEDE and OpenWRT promise a joint release soon

It all started in March 2016, when a group of developers, unhappy with where
OpenWRT was heading, created LEDE, or the Linux Embedded Development
Environment. The group created the alternative LEDE because of concerns that
OpenWRT lacked a process to bring new core developers into the project at a
time when developer numbers were dwindling.

At the end of 2016, the two groups started discussing merging again, and in
May 2017, they reached an agreement on the terms of a merger. The OpenWRT
and LEDE open source router projects have now merged and a major release is
due in the coming months.

OpenWRT has been known as a serious open source codebase for router 
firmware. It allows users to overwrite vendor firmware, either for security reasons 
or to conduct their own low level development. 

At the time of the split, LEDE was led by Jo-Philipp Wich, John Crispin, 
Daniel Golle, Felix Fietkau, Hauke Mehrtens, Matthias Schiffer and Steven Barth. 

The OpenWRT - LEDE merged open source router project has now gone live. 
The announcement says the project will be governed under the rules of the LEDE 
project, and that the focus will be on small, frequent minor releases, as well as 
stability and release maintenance. 


FB strengthens its
commitment to open
source with Docusaurus
As part of its commitment to the 
open source community, Facebook 
has unveiled Docusaurus, which 
is an open source toolkit that 
lets users and teams publish 
documentation websites without 
having to worry about the 
infrastructure and design details. 

Sources claim that Docusaurus 
is an easy-to-maintain open source 
documentation tool that uses 
Markdown to help users write 
docs and blog posts. The primary 
function of Docusaurus is to 
publish a set of static HTML files 
that are ready to upload onto a 
server for public viewing. 

Banking upon open source,
Docusaurus enables developers to
customise their project’s layout,
reusing the same header and
footer. Many developers have
praised how approachable this
makes the tool for those not that
familiar with the technology.
“The easiest React static site
generators. Unlike Gatsby and 
Next, there is no need to fiddle with 
any real code/complicated API just 
to get a Markdown blog working,” 
says one user. 

Docusaurus comes with a 
couple of other very useful perks 
as well. There’s localisation that 
comes preconfigured with support 
for over 70 languages. There’s also 
versioning support, and document 
search, so that you can easily 
trace the document you wish to 
update or consult. 

This is not the first time 
Facebook has released toolkits 
and products to the open source 
community for further development 
and broader adoption. 
Open source Lightroom
alternative available for 
Windows users 

For those looking for a cheaper
alternative to Lightroom to manage
and process raw photos, the free and
open source software darktable is
finally available on Windows.

Photographers have been using 
darktable on Mac and Linux based 
systems since its release in 2009. 
Now, after eight years, darktable has 
been officially ported to Windows in 
its new version 2.4.0. It is a software 
that contains a set of photo editing 
features that help you do non- 
destructive post-processing on raw 
files, especially in large batches. 

For quite some time, 
photographers have been requesting 
a Windows build. The project leaders 
had refrained from venturing into 
Windows due to a lack of people who 
could be dedicated to maintaining it 
in the long term. Recently, developer 
Peter Budai brought new hope for 
Windows users. 


In its announcement, sources at 
darktable shared their intent to support 
it in the future too. The release also 
highlighted a few missing features 
such as the lack of printing support 
and the need for installing special 
drivers for tethering. The new version 
comes with its own set of bugs (TIFF 
import and export don’t support non- 
ASCII characters in file names) too. 

In addition to a Windows 
build, darktable version 2.4.0 has 
a host of improvements including 
a new module for haze removal,
undo support for masks, intelligent 
grouping of undo steps, and more. 


Support has ended for OpenWRT releases prior to 15.05 (which means no 
security or bug fixes), and the project team has warned that OpenWRT CC 15.05 
patches will run behind time for a while since it’s “...not yet fully integrated into 
our release automation.” 

“The LEDE 17.01 release will continue to get full security and bug fix support 
for both source code and binary releases,” the project team announced. 


Capsule8 launches open source sensor capable
of detecting Meltdown
Capsule8 has created the first practical strategies for detecting Meltdown
exploits on Linux based systems and is now making these
available to the public.
It has unveiled the beta version of the Capsule8
open source attack detection sensor. The new sensor is
used as part of the Capsule8 Protect platform, and will
facilitate real-time detection of Linux based attacks.

Next, the company has announced providing open source proof-of-concept 
code for the fast and efficient detection of the Intel Meltdown vulnerability, 
with minimal false positives. 

“Remediation works but it’s painful in terms of the time and resources required. 
The necessary upgrades lead to huge cost and stability risks,” said Dino Dai Zovi, 
co-founder and CTO of Capsule8.

The Capsule8 open source sensor is built to support an efficient gathering 
of system level telemetry, much like the commonly used auditd, but is built for 
performance under load. Currently, Capsule8’s Protect platform is in beta mode. It 
uses the sensor to do real-time attack disruption, enabling people to detect zero-day 
attacks and respond to them in real-time. 

Anyone using the Capsule8 open source attack detection sensor can build their 
own attack strategies. As an example, the company has provided a strategy for 
detection of the recent Meltdown vulnerability under an Apache licence. 

The sensor works for any out-of-the-box version of Linux, dating back to the 
Linux 2.6 kernel. 

“Now, organisations can specifically detect attempts to exploit these problems, 
giving them the ability to monitor for the problem and respond in real-time, 
up until they’re able to remediate appropriately,” said John Viega, co-founder 
and CEO of Capsule8. 


‘Open Source Initiative’ turns 20 this February! 
The term ‘open source’ was coined at a ‘strategy session’ held on February 3, 1998, 
in Palo Alto, California. In the same month, the ‘Open 
Source Initiative’ was founded. It was created as a general 
educational and advocacy organisation to raise awareness 
about the superiority of an open development process, 
and hence drive its adoption. 
The Open Source Initiative (OSI) has been raising
awareness and promoting the adoption of open
source software since it was founded. In 2018, it is
celebrating the 20th anniversary of open source. This
is a huge milestone for everyone involved with technology. Nowadays, open 
source is ubiquitous, recognised across industries as a fundamental component to 
infrastructure, as well as a critical factor for driving innovation. 
Kubeflow takes ML to
Kubernetes 

The newly announced project from 
Google engineers, called Kubeflow, 
aims to leverage machine learning 

to address the hurdles of launching 
convoluted workloads on Kubernetes, 
which is an open source platform that 
serves as the backbone of container 
orchestration management. With the 
arrival of Kubeflow, Kubernetes will
be able to use machine learning (ML) 
stacks anywhere. 

Specifically, Kubeflow includes the 
JupyterHub platform, which enables 
data science and research groups to 
create and manage Jupyter notebook 
servers. Additionally, Kubeflow
includes a TensorFlow Custom
Resource, which can support either
CPUs or GPUs, and be tailored to
manage a specific container
cluster size.

In a
company blog post, Philip Winder, an
engineer and consultant at Container 
Solutions, wrote, “Like DevOps has 
merged operations and development, 
DataDevOps will consume data science.” 

The company also shared that 
it believes working with multiple 
environments, from development to 
production, will become the norm for 
most Kubeflow users. Consequently, 
Kubeflow is making use of the
Ksonnet project, which is intended to 
make it easier to transfer workloads 
across multiple environments. 

Kubeflow is currently working
to cultivate a community around 
the project. Among the companies 
collaborating on the project are 
CaiCloud, Red Hat and OpenShift, 
Canonical, Weaveworks, Container 
Solutions, etc. The project was much 
needed to make it easier to set up 
and productionise machine learning 
workloads on Kubernetes. 


To mark the anniversary, OSI is organising several activities through the year. 
Opensource.net was launched for this purpose. Nick Vidal from the OSI said
that the completion of two decades will be marked by worldwide celebrations
in conjunction with major technology conferences such as FOSDEM, OSCON, 
Open Source Summit, FOSSAsia, Campus Party, Linux.conf.au, etc. 


Baidu’s Apollo Autonomous driving platform chooses ON 
Semiconductor’s image sensors 
Image sensing is a key component of the Apollo platform, which supports enhanced 
autonomous driving. Post collaboration with ON Semiconductor, the partners of
Baidu’s Apollo Autonomous driving platform
can get the jointly developed plug-and-play
compatible imaging solutions.
Apollo provides an open source, reliable
software and hardware system, enabling the 
efficient development of autonomous driving systems by automotive systems 
designers. The collaboration will simplify the implementation of image sensing 
solutions for automotive manufacturers and suppliers, while enhancing the speed. 

ON Semiconductor’s fully-qualified 3mm-based advanced CMOS image sensors 
provide flexibility to move to future sensors at volume deployment. It also allows 
customers to begin development of vision systems for autonomous driving. With 
high dynamic range (HDR), the sensor is able to provide crisp, clear single images 
as well as video in the challenging low and mixed light scenes typical of automotive 
environments. 

In addition to the simplified and accelerated design, and test and implementation 
of automotive camera applications, the relationship will offer all Apollo partners 
the early opportunities to work with future generation, breakthrough image sensing 
technologies from ON Semiconductor. 

Ross Jatou, VP and GM of the automotive solutions group at ON Semiconductor 
said, “We believe that the value of such a platform to automotive system designers 
will be tremendous. Image sensors are fundamental components of ADAS 
implementations throughout the vehicle, and they will become even more relevant 
as the industry moves towards fully autonomous cars. Joining forces with Baidu by 
providing the image sensor solution for the Apollo platform is further validation of 
ON Semiconductor’s leading position in automotive image sensing.” 


‘Indian Open Source Project Incubator’ by Paytm 

Paytm has launched a virtual incubator to promote its ‘Build for India’ initiative. 
Now professionals and students can build open source based solutions, and share 
these projects with the global developer community, thanks 
to the recent announcement of the ‘Indian Open Source 
Project Incubator’ from Paytm. 

According to Vijay Shekhar Sharma, founder and CEO, 
Paytm, the virtual incubator will aim to find developers 
who can use technology and design to build world class 
solutions. The company shared that as part of the incubator 
programme, Paytm will make its premises available for mentoring, meet-ups, as 
well as technology and design support, from time to time. 

The launch is in line with the company’s ‘Build for India’ initiative to promote 
a ‘building and sharing’ culture among developers in the country. It will focus 
on open source projects in the field of education, digital lifestyle, and financial 
inclusion, including new age technologies like machine learning, augmented reality
and virtual reality. 

‘Indian Open Source Project Incubator’ will also encourage students to contribute 
to open source projects. Paytm plans to conduct hackathons, bootcamps, meet-ups, 
and mentoring sessions while partnering with around 20 institutions. Outstanding
external contributors to the open source project will be awarded ‘Paytm Scholar’ 
certificates as well as the opportunity to join the company. 


‘World in Conflict’ servers’ code gets open sourced 

In a major move, Ubisoft has released an open source version of Massgate, the 
central server that once powered the online functionality for Massive Entertainment’s 
real-time strategy title, World in Conflict. 

With open source code nestled snugly in a GitHub repository, players can now 
host their own World in Conflict servers. And, as Ubisoft points out, the code itself 
offers an educational snapshot of how online 
servers were crafted over a decade ago. 

The company says it was inspired to open 
Massgate up to the public after watching 
community-led efforts to bring the game’s 
multi-player functions back to life. According 
to Ubisoft, the code is, more or less, the same as 
it was when the game was launched in 2007, save for a few tweaks made to make it 
compatible with modern compilers. 

World in Conflict marks the second notable game to have missing features 
restored recently through the release of open source code. Riot-owned Radiant 
Entertainment also recently announced that it would be releasing a build of its 
cancelled fighting game Rising Thunder, complete with an open source version of the 
source code for its online servers. 


LinkedIn open sources tools to combat website navigation issues 
Lead developer Steven Callister in a blog post shared that LinkedIn has open sourced 
two new tools to assist engineers in automating the investigation of broken hosts and 
services: Fossor and Ascii Etch. Fossor (the Latin word for grave digger) is a Python 
tool, while Ascii Etch, another Python library, outputs information gleaned from 
Fossor in Ascii-character graphs. 

Callister wrote in a blog post, “Having experienced the pain of performing the 
same repetitive steps again and again during my own on-call shifts, I concluded 
that writing a tool to perform some of these basic checks in parallel would speed up 
the mean time to resolution. Taking the idea even further, I wanted a tool that could
perform checks tailored specifically to my services while still having the
flexibility to incorporate newly-developed checks in the
future. Fossor was created to do just that.”

Fossor’s design splits the two components of the program, the engine and plugins, 
to reduce the incidence of serious bugs, whereas Ascii Etch was originally created to 
draw the results from running Fossor. Callister wrote that it proved more helpful than 
simple text for quickly spotting anomalies in the data. 

Callister also said that the development team hopes that the specificity-through- 
modularity of Fossor will greatly benefit site administrators and the open source 
community while it contributes more plugins to the automation tool. Now that Ascii 
Etch is open sourced, it also supports vertical value scaling, as well as horizontal 
value compression. 


FOSSBYTES 


Amazon’s cloud business 
takes a leap with Linux 2 
Amazon has started offering its 
enterprise customers a new version 
of the Linux operating system -- 
Linux 2, which runs from Amazon’s 
data centres. Thus, it replaces the 
cloud computing juggernaut’s 
software that needs to be installed 
on customers’ own servers. 


While Amazon is making the 
software available for companies to 
install on their own servers, it is also 
renting access to Linux 2 to its cloud 
customers. There, it can be used to 
run many of the most popular server 
software programs and technologies, 
including Microsoft’s Hyper-V, 
VMware, Oracle’s VM VirtualBox, 
Docker and Amazon’s Docker 
alternative, Amazon Machine Image. 

Linux 2 carries five years’ 
support that promises security 
patches and bug fixes, just like other 
software vendors offer with their 
wares. Needless to mention, it is 
designed to work well with open 
source databases, programming 
languages and other popular 
applications. 

The move is significant, 
because now Amazon competes 
with companies that provide server 
hardware and software. With Linux 
2, instead of buying it, enterprises 
can now rent it all from the cloud 
and pay fees on a per-use basis. 
Linux 2 is the second version of
Amazon Linux. Customers could
run the previous version on their 
own servers also, but in a much 
more limited way. 
Windows gets Kubernetes
1.9 beta support 

The arrival of Kubernetes 1.9 is 
nicely timed as it has become part 
of the cloud container orchestration 
programme. The container 
management programme is available 
on all serious cloud platforms since 
Amazon Web Services (AWS) 
adopted Kubernetes. 




Kubernetes was originally 
developed for Linux systems, but 
there’s been a demand for Kubernetes 
to run Windows workloads as 
well. This is in addition to making 
Kubernetes available for Linux on 
Azure. 

In October 2017, Microsoft 
announced a dedicated Azure 
Container Service for Kubernetes. 
Running Windows apps on Docker 
on Windows Server, while managing 
them with Kubernetes, is now in 
beta. The Apps Workloads application
programming interface (API) is now
stable and available for general use.

Deployment and ReplicaSet, two of
the most commonly used objects, form
the foundation for long-running
Kubernetes stateless workloads, and
are stable now.

From its start, Kubernetes has 
supported multiple options for 
persistent data storage, including 
commonly used NFS or iSCSI. 
Kubernetes developers are addressing 
this by adopting Container Storage 
Interface (CSI). Kubernetes 1.9 is 
available for download on GitHub. 


Red Hat OpenShift Application Runtimes empowers 
cloud-native development 

Red Hat has announced the general availability of Red Hat OpenShift Application 
Runtimes. The announcement aims to enable organisations to 
accelerate cloud-native app development with a curated set of 
frameworks and runtimes for prescriptively building and running 
microservices-based applications. 

Red Hat OpenShift Application Runtimes leverages the 
company’s decade-plus of experience with Red Hat JBoss Middleware to build from 
the ground up a solution for the next generation of microservices-based application 
development. Red Hat aims to balance the developers’ need for choice with the 
operational requirement for standardisation and support. 

It provides a tightly integrated and fully supported offering for developing 
microservices in multiple languages and frameworks. Certified and supported 
runtimes available with Red Hat OpenShift Application Runtimes include Java EE, 
WildFly Swarm, Eclipse MicroProfile, Eclipse Vert.x, Node.js and Spring Boot. 

The key features and benefits include simplified development and strategic 
flexibility. Combined with the OpenShift service catalogue, enterprise IT 
organisations can take full advantage of multi-cloud investments by integrating cloud 
based services. Due to the integration of Red Hat OpenShift Container Platform with 
Red Hat OpenShift Application Runtimes, developers get access to a fully automated 
platform for provisioning, building and deploying applications and their components. 




Dgraph raises US$ 3 million for its open source
distributed graph database 

Dgraph, an increasingly popular open source distributed graph database that uses a 
version of Facebook’s GraphQL as its default 
query language, has released version 1.0. 

Dgraph has announced that it has raised US$ 
3 million in funding from Bain Capital Ventures, 
Atlassian co-founder Mike Cannon-Brookes, 
Blackbird Ventures and AirTree (this includes a 
US$ 1.1 million seed round the company raised 
last year). The company also announced that its flagship open source distributed 
graph database has hit the version 1.0 stage. 

Dgraph founder Manish Jain shared that finding funding for this project wasn’t 
easy, but a chance meeting got him in front of Bain, which had already been pitched 
by competing companies. It also didn’t help that Jain was living in Australia at the 
time and that he founded the company there. Now, with a green card in hand, he’s 
moving himself and the company to the US. 

The company believes that Dgraph’s edge over competitors like Neo4j and others
is the fact that it was built as a distributed database from the ground up. 

The Dgraph project started in late 2015, and even though it didn’t hit version 1.0 
until now, it’s already being used in production by quite a few developers. Current users 
include gaming services, as well as advertising and financial technology companies that, 
among other things, use it as part of their fraud detection platforms. Other use cases 
include the likes of search engines, IoT, medical research, machine learning and AI. 


For more news, visit www.opensourceforu.com 


Sandya Mannarswamy 


In this month’s column, we continue our discussion on detecting duplicate 
questions in community question-answering forums. 


Let’s continue exploring the topic we started
out on in last month’s column, in which we
discussed the problem of detecting duplicate
questions in community question answering (CQA) 
forums using Quora’s question pair data set. 

Given a pair of questions <Q1, Q2>, the task 
is to identify whether Q2 is a duplicate of Q1. Our 
system for duplicate detection first needs to create 
a representation for each input sentence, and then 
feed the representations for each of the two questions 
to a classifier which will decide whether they are 
duplicates or not by comparing the representations. 

In this month’s column, I will provide some of the 
skeleton code functions which can help to implement 
this solution. I have deliberately not provided the 
complete code for the problem, as I would like readers 
to build their own solutions and become familiar with 
creating simple neural network models from scratch. 

As discussed in last month’s column, while there are 
multiple methods of creating a sentence representation, 
we can simply use a concatenation of word embeddings 
of the individual words in the question sentence to 
create a question embedding representation. We can 
either learn the word embeddings specific for the task 
of duplicate question-detection (provided our corpus 
is large and general enough), or we can use pre-trained 
word embeddings such as Word2vec. 

In our example, we will use Word2vec embeddings 
and concatenate the embeddings of individual words 
to represent the question sentence. Assuming that we 
use Word2vec embeddings of dimension D, each input 
question can be represented by an (N × D) matrix, where N is the
number of words in the question and D is the embedding 
size. We need to feed in the input representation for 
each of the two questions we are comparing to the neural 
network model. 

Before we look at the actual code for the neural
network model, we need to decide which deep learning
framework we would use to implement this model.
There are a number of choices such as Theano, MXNet,
Keras, CNTK, Torch, PyTorch, Caffe, TensorFlow,
etc. A brief comparison of some of the popular deep
learning frameworks is covered in the article at
https://deeplearning4j.org/compare-dl4j-tensorflow-pytorch.

In selecting a deep learning framework for
a project, one needs to consider ease of
programming, maintainability of code and long-term
support for the framework. Some of the frameworks,
such as Theano, are from academic groups; hence
their support may be time-limited. For this project, we
decided to go with TensorFlow, given its widespread
adoption in the industry (it is sponsored by Google)
and its ease of use. We will assume that our readers
are familiar with TensorFlow (a quick introduction
to it can be found at
https://www.tensorflow.org/get_started/get_started).

Let us assume that we have a binary file which
contains the word to embedding mapping for the
Word2vec embeddings (pretrained word embeddings
are available from either
https://code.google.com/archive/p/word2vec/
if you want to use the Word2vec model or
https://nlp.stanford.edu/projects/glove/
if you want to use the GloVe vectors).

First, let’s read our training corpus and build 
a vocabulary list that contains all the words in our 
training corpus. Next, let’s build a map which maps 
each word to a valid Word2vec embedding. Shown 
below is the skeleton code for this: 


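import numpy as np

w2v_wordmap = {}  # maps each vocabulary word to its embedding vector

def create_word_map(w2v_file, vocab_list):
    # The file is expected to hold a 'glove' array with one row per
    # vocabulary word, in the same order as vocab_list.
    data = np.load(w2v_file)
    glove_array = data['glove']
    for idx, word in enumerate(vocab_list):
        w2v_wordmap[word] = glove_array[idx]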
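
The vocabulary-building step described above is not shown in the column. As a minimal sketch, assuming the training corpus is available as a list of question strings and that a simple lowercase word tokenizer is adequate (both are assumptions, and `simple_tokens` and `build_vocab_list` are hypothetical helper names), it might look like this:

```python
import re

def simple_tokens(text):
    # Hypothetical tokenizer: lowercase, keep runs of letters, digits and apostrophes.
    return re.findall(r"[a-z0-9']+", text.lower())

def build_vocab_list(questions):
    # Collect each distinct word once, in order of first appearance,
    # so that a word's position can serve as its row index.
    seen = set()
    vocab_list = []
    for ques in questions:
        for token in simple_tokens(ques):
            if token not in seen:
                seen.add(token)
                vocab_list.append(token)
    return vocab_list

corpus = ["How do I learn Python?", "How can I learn Python quickly?"]
vocab = build_vocab_list(corpus)
print(vocab)
```

Keeping the words in a list with a stable order matters here because the word map above pairs each word with the embedding row at the same position.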


Next, we will create the sentence matrix associated with 
each question, as follows: 


def question2sequence(ques):
    tokens = get_tokens(ques)
    rows = []  # represents the sentence embedding matrix
    # Greedy search for tokens
    for token in tokens:
        assert token in w2v_wordmap, token + " not found in w2v_map"
        rows.append(w2v_wordmap[token])

    if len(tokens) < MAX_QN_LEN:
        # question is too short;
        # we need to pad the question up to max sequence length
        j = MAX_QN_LEN - len(tokens)
        word = _UNK
        while j > 0:
            rows.append(w2v_wordmap[word])
            j = j - 1

    return rows
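
To see what the padded sentence matrix looks like, here is a small self-contained sketch that uses toy two-dimensional vectors in place of real Word2vec embeddings. The values of `MAX_QN_LEN` and `_UNK`, the toy word map and the helper name `pad_or_truncate` are all assumptions made for illustration. Unlike the skeleton above, this variant falls back to the `_UNK` vector for unseen words and truncates questions that are too long, which is one possible design choice rather than the column's method:

```python
MAX_QN_LEN = 5          # assumed maximum question length, for illustration
_UNK = "<unk>"          # assumed padding/unknown token

# Toy 2-dimensional embeddings standing in for real Word2vec vectors.
toy_wordmap = {
    "is": [0.1, 0.2],
    "python": [0.3, 0.4],
    "fast": [0.5, 0.6],
    _UNK: [0.0, 0.0],
}

def pad_or_truncate(tokens, wordmap, max_len=MAX_QN_LEN):
    # Look up each token, falling back to the _UNK vector for unseen words.
    rows = [wordmap.get(tok, wordmap[_UNK]) for tok in tokens[:max_len]]
    # Pad short questions with the _UNK vector up to max_len rows.
    rows.extend([wordmap[_UNK]] * (max_len - len(rows)))
    return rows

matrix = pad_or_truncate(["is", "python", "fast"], toy_wordmap)
print(len(matrix), len(matrix[0]))  # number of rows, embedding size
```

Every question then yields a matrix with the same number of rows, which is what allows a whole batch of questions to be fed to the network as a single tensor.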


Now, let’s convert this sentence matrix into a fixed-length
vector representation of the question. Given a sequence of
input words (this constitutes the question sentence), we
now pass this sequence through a recurrent neural network
(RNN) and create an output sequence. We can use either
vanilla RNNs, gated recurrent units (GRU) or long short-term
memory (LSTM) units for creating a fixed-length
representation from a given input sequence. Given that
LSTMs have been quite successfully used in many
NLP tasks, we decided to use them to create the fixed-length
representation of the question.


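import tensorflow as tf

# Input shape is [N, l_q1, D]: here N is the batch size, l_q1 the question
# length in words and D the embedding size
question1 = tf.placeholder(tf.float32, [N, l_q1, D], 'question1')

lstm_cell = tf.contrib.rnn.BasicLSTMCell(num_lstm_units)
value, state = tf.nn.dynamic_rnn(lstm_cell, question1, dtype=tf.float32)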


While an RNN generates an output for each input in the
sequence, we are only interested in the final aggregated
representation of the input sequence. Hence, we take the
output of the LSTM at the last time step and use it as
our sentence representation. Note that the last time step
corresponds to the last word in the sentence being fed
to the LSTM. Hence, the LSTM output corresponds to
an aggregated representation of the current word and all
the words that come before it. Hence, it represents the
complete sentence.


# Reorder to [time, batch, units] so that the last time step can be selected
value1 = tf.transpose(value, [1, 0, 2])
lstm_output = tf.gather(value1, int(value1.get_shape()[0]) - 1)


Just as we obtained a fixed-length representation for
Question 1, we also create a fixed-length representation
for Question 2 using a second LSTM. The last stage output
from each of the two LSTMs (one LSTM for each of the
two questions) represents the input question representation.
We can then concatenate these two representations and feed
them to the multilayer perceptron (MLP) classifier.

An MLP classifier is nothing but a fully connected multilayer 
feed forward neural network. Given that we have a two-class 
prediction problem, the last stage of the MLP classifier is a 
two-unit softmax, whose output gives the probabilities for each 
of the two output classes. Here is the skeletal code for an MLP 
classifier with three densely connected feed forward layers, with 
256, 128 and two units each: 


predict_layer_one_out = tf.layers.dense(lstm_output,
                                        units=256,
                                        activation=tf.nn.relu,
                                        name="prediction_layer_one")
# Any dropout on the first layer's output would be applied before this layer
predict_layer_two_out = tf.layers.dense(predict_layer_one_out,
                                        units=128,
                                        activation=tf.nn.relu,
                                        name="prediction_layer_two")
predict_layer_logits = tf.layers.dense(predict_layer_two_out,
                                       units=2,
                                       name="final_output_layer")

Here are a couple of questions for our readers to think about:

• How do you decide on the number of hidden layers and
  the number of units in each hidden layer for your MLP
  classifier?
• In our problem, we are doing binary classification as we
  need to predict whether a question is duplicate or not. How
  would this code change if you had to predict one out of K
  different classes (for example, if you are trying to predict
  which category a particular question may belong to)?

The last layer output is used as input to a TensorFlow
network loss computation node, which computes the
cross-entropy loss using the ground truth labels as shown below:


loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(
    logits=predict_layer_logits, labels=labels))

optimizer = tf.train.AdamOptimizer().minimize(loss)
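
To make the loss computation concrete, the following plain-Python sketch works through what a softmax cross-entropy averaged over a batch computes for two-class logits and one-hot labels. It is an illustration of the arithmetic only, not TensorFlow's implementation, and the sample logits and labels are made up:

```python
import math

def softmax(logits):
    # Subtract the maximum logit for numerical stability.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, one_hot_label):
    # Cross-entropy between a one-hot label and the softmax of the logits.
    probs = softmax(logits)
    return -sum(y * math.log(p) for y, p in zip(one_hot_label, probs))

batch_logits = [[2.0, 0.0], [0.0, 0.0]]
batch_labels = [[1, 0], [0, 1]]
losses = [cross_entropy(l, y) for l, y in zip(batch_logits, batch_labels)]
mean_loss = sum(losses) / len(losses)
print(mean_loss)
```

The first example is classified confidently and correctly, so its loss is small; the second has equal logits, so its loss is log 2. The mean over the batch is the single number that the optimiser tries to minimise.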


The network is then trained with ground truth labels 
during the training phase to select network weights such that 
cross-entropy loss is minimised. 

We also need to add code that can compute the accuracy 
during the training phase. 
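
As a rough sketch of what such an accuracy computation does, the plain-Python version below compares the highest-scoring class of each prediction with the one-hot ground truth label. The helper names and sample values are assumptions for illustration:

```python
def argmax(values):
    # Index of the largest value (the first one wins on ties).
    return max(range(len(values)), key=lambda i: values[i])

def batch_accuracy(batch_logits, batch_labels):
    # Fraction of examples whose highest-scoring class matches the one-hot label.
    correct = sum(
        1 for logits, label in zip(batch_logits, batch_labels)
        if argmax(logits) == argmax(label)
    )
    return correct / len(batch_logits)

logits = [[2.0, -1.0], [0.3, 0.9], [1.5, 0.2], [-0.4, 0.1]]
labels = [[1, 0], [0, 1], [0, 1], [0, 1]]
print(batch_accuracy(logits, labels))  # 3 of the 4 predictions match
```

Inside a TensorFlow graph, the same idea is commonly expressed by combining tf.argmax, tf.equal, tf.cast and tf.reduce_mean on the logits and labels tensors.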

As we had discussed in earlier columns on neural networks,
gradient descent techniques are typically used to learn the
network parameters/weights. Typically, mini-batch gradient descent
is used for weight updates during the training process. Here are
a couple of questions for our readers:

• Why do we prefer mini-batch gradient descent over full-batch
  gradient descent or stochastic gradient descent?
• How do you choose a good batch size for your
  implementation?

Once we have built the neural network model, we can 
train our model with labelled examples. Remember that 
each training loop consists of going over all the training 
samples once. This is typically known as an ‘epoch’. Each 
epoch consists of several batch-sized runs, wherein at 
the end of each batch, the gradients computed are used to 
update the network weights/parameters. In order to ensure 
that the network is learning correctly, we need to measure 
the total loss at the end of each epoch and verify that the 
total loss is decreasing at the end of each successive epoch. 
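
The epoch and batch structure described above can be sketched with a deliberately tiny stand-in model. The example below fits a single weight w so that w * x approximates y, using squared error instead of the cross-entropy loss of our classifier; the data, learning rate and function names are all assumptions chosen for illustration. What matters is the shape of the loop: one weight update per batch, and one total-loss figure per epoch.

```python
def batches(samples, batch_size):
    """Yield consecutive batch-sized slices of the training samples."""
    for start in range(0, len(samples), batch_size):
        yield samples[start:start + batch_size]

def train(samples, epochs=5, batch_size=2, learning_rate=0.05):
    w = 0.0
    epoch_losses = []
    for _ in range(epochs):              # one epoch = one pass over all samples
        total_loss = 0.0
        for batch in batches(samples, batch_size):
            # Loss and gradient of the squared error for this batch
            total_loss += sum((w * x - y) ** 2 for x, y in batch)
            grad = sum(2 * (w * x - y) * x for x, y in batch) / len(batch)
            w -= learning_rate * grad    # weight update at the end of the batch
        epoch_losses.append(total_loss)
    return w, epoch_losses

samples = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]  # y = 2x
w, epoch_losses = train(samples)
print(w, epoch_losses)
```

Printing epoch_losses after training is the kind of check the text recommends: if the values do not fall from one epoch to the next, something is wrong with the learning set-up.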

Now we need to decide when we should stop the 
training process. One simple but naive way of stopping the 
training process is after a fixed number of epochs. Another 
option is to stop training after we have reached a training 
accuracy of 100 per cent. 
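
The two stopping rules just mentioned can be written as one small check that runs at the end of every epoch. This is only a sketch: should_stop, max_epochs and target_accuracy are hypothetical names, and the accuracy values are made up to show the control flow.

```python
def should_stop(epoch, train_accuracy, max_epochs=20, target_accuracy=1.0):
    """Stop after max_epochs epochs, or once training accuracy hits the target."""
    reached_epoch_limit = epoch + 1 >= max_epochs  # epochs are counted from 0
    reached_accuracy = train_accuracy >= target_accuracy
    return reached_epoch_limit or reached_accuracy

# Hypothetical per-epoch training accuracies, for illustration only
accuracy_history = [0.62, 0.80, 0.93, 1.0, 1.0]
stopped_at = None
for epoch, acc in enumerate(accuracy_history):
    if should_stop(epoch, acc):
        stopped_at = epoch
        break
print(stopped_at)
```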

Here is a question for our readers: While we can decide to 
stop training after some fixed number of epochs or after a training 
accuracy of 100 per cent, what are the disadvantages associated 
with each of these approaches? We will discuss more on this topic 
as well as on the inference phase in next month’s column. 

If you have any favourite programming questions/software 
topics that you would like to discuss on this forum, please 
send them to me, along with your solutions and feedback, at 
sandyasm_AT_yahoo_DOT_com. Wishing all our readers 


By: Sandya Mannarswamy 


The author is an expert in systems software and is currently 
working as a research scientist at Conduent Labs India 
(formerly Xerox India Research Centre). Her interests include 
compilers, programming languages, file systems and natural 
language processing. If you are preparing for systems 
software interviews, you may find it useful to visit Sandya's 
LinkedIn group ‘Computer Science Interview Training (India)' 
at http://www.linkedin.com/groups?home=&gid=2339182.
BenQ releases two dustproof projectors in India

Computing and communication devices manufacturer, BenQ, has
recently introduced two dustproof projectors in India: the BenQ
DX808ST and the BenQ MW883UST. Both the projectors come with
BenQ Dust Guard technology and a dustproof protection rating of
IP5X, which helps the devices to endure dusty environments.

Company sources claim the BenQ projectors perform well even with
the increasing pollution levels in India. The BenQ Dust Guard
technology uses a sealed optical engine design, advanced colour
wheel sensors, and high-performance dust filters which block out
particles up to the level of PM2.5, thus offering extended
projection without colour decay.

The BenQ DX808ST offers Blu-ray Full HD 3D support and features
SmartEco power saving technology and 0.61 short throw ratio,
making it perfect for educational purposes. The MW883UST offers
wireless use and interactivity, 3,300 ANSI lumen brightness for
clearer and sharper images, TR0.23 ultra short throw, and
HDMI/VGA power control. Both the projectors can be purchased
online and at retail stores.

Price: ₹ 44,000 for BenQ DX808ST and ₹ 86,000 for BenQ MW883UST

Address: BenQ India Pvt Ltd, 3rd Floor, 9B Building, DLF Cyber
City, DLF Phase 3, Gurugram, Haryana 122002


Zebronics’ latest gaming headphones launched

Mobile accessories manufacturer, Zebronics, has launched its
latest gaming headphones called Orion. The lightweight headphones
are designed with a microphone and volume control buttons, along
with metallic ear cups and LEDs. They also come with a padded
headband and a comfortable ear cushion for extended hours of
gaming.

With an onboard sound processor, gamers can experience 7.1
surround sound. The headphones come with 3 metres of braided
cable, a flexible microphone, a USB interface and a 40mm driver
unit with a 20Hz to 20,000Hz frequency response.

The inline control can be used to adjust RGB lights, mic. and
volume.

The stylish and durable headphones are perfect for long hours of
gaming. The Zebronics Orion can be purchased online and at
retail stores.

Address: Zebronics India Pvt Ltd, B-24/3, Phase 2, Okhla
Industrial Area, New Delhi 110020


Stylish and detachable speakers from Toreto

Toreto, a leading gadget manufacturer, has unveiled its latest
Bluetooth speakers called Twin Magno. The highlight of the
speakers is the True Wireless Stereo (TWS) feature, a magnetic
detachable design that converts the pair into two separate
speakers, on which different music can be played at the same
time.

Apart from Bluetooth V4.2, the device also supports the aux.
feature, which lets a user play music even when not connected
via Bluetooth.

The speakers offer an output of 5W x 2, enabling loud and clear
voice up to a range of 10 metres. The company claims that the
speakers offer up to eight hours of playback. A user can also
play each unit individually.

Offering a classic look with its woodlike finish, the speakers
come with a pouch to carry them safely on picnics, excursions,
etc. The Toreto Twin Magno is available for purchase on the
company’s website.

Address: Toreto Retail Private Limited, 12A/38, Saraswati Marg,
W.E.A. Karol Bagh, New Delhi 110005
AI powered smartphone launched in India


Multinational network and
telecommunications company,
Huawei, has introduced its first ever
AI enabled smartphone in India, the
Honor View10.

The highlight of the bezel less 
smartphone is its AI enabled powerful 
Kirin 970 chipset that is packed with 
a neural network processing unit 
(NPU), enabling the device to perform 
extremely well. The smartphone has a 
14.9cm (5.9 inch) full view display, and 
the device is embedded with many new 
AI applications that recognise various
scenes while taking pictures. 

With AI accelerated Microsoft 
Translator software, the smartphone 
can translate into different languages 
even without an Internet connection. 
The device is also capable of 
optimising performance as per a user’s 
behaviour and requirements. 


For enhancing the mobile gaming 
experience, the smartphone comes 
embedded with a gaming suite. 

The new Kirin 970 chipset features
an octa-core ARM Cortex CPU and a
Mali-G72 12-core GPU, and the phone runs on
Android 8.0 Oreo. The smartphone
comes with 6GB RAM and 128GB
ROM, which is further expandable up
to 256GB via microSD card. On the
camera front, the smartphone sports a
20MP + 16MP dual-lens camera on the
rear and a 13MP front camera. Along
with the 5V/4.5A Honor Supercharge
feature, the device is backed with a
3,750mAh battery.

The Huawei Honor View10 is available
exclusively at Amazon.in.

Price: ₹ 30,000


Address: Huawei India, India Region Headquarters, 14th Floor, Tower C, Unitech 
Cyber Park, Sector-39, Gurgaon-122002, Haryana; Ph: 0091-124-4774700 


16-port Gigabit Ethernet unmanaged switch from Digisol

Digisol, a subsidiary of Smartlink 
Network Systems, has recently made 
available its latest, compact Gigabit 
Ethernet unmanaged desktop version 
switch—the DG-GS1016D-A, which 
is designed to enhance network 
performance. 

The switch comes with 16 x 
10/100/1000Mbps Ethernet ports 
and uses ‘store and forward’ packet 
switching technology to ensure a 
reliable data transfer. 

Eliminating the use of crossover 
cables or dedicated uplink ports, 
the device supports automatic MDI/ 
MDI-X detection. 


An ideal solution for small Ethernet
work groups, the switch features
duplex mode along with auto-sensing
and auto negotiation of the highest
available speed (10/100/1000Mbps),
thus providing an automatic and
flexible solution for the network.




With a capacity of 32Gbps, 
the switch offers data transfer 
rates at an average of 2000Mbps 
per port at full duplex mode. 

The feature-rich Digisol 
DG-GS1016D-A is available via 
retail stores. 


Address: Digisol Systems Ltd, Smartlink House, Plot No. 5, Bandra Kurla
Complex Road, Santa Cruz (E), Mumbai 400098


The prices, features and specifications are based on information provided to us, or as available 
on various websites and portals. OSFY cannot vouch for their accuracy. 


Compiled by: Aashima Sharma 
Open Journey


As I waited at the coffee shop to
interview one of India’s pioneering
open source experts in Gurugram
(Gurgaon), a slim man walked
towards me and with a pleasant
smile, asked if I was Syeda Beenish
from OSFY. As we chatted about
technology and open source, I could
sense that Kishore Bhargava,
CEO, Linkaxis Technologies,
and an early adopter of FOSS,
was an excellent leader. He had
no airs about him, and his deep
understanding of technology left
me spellbound. Over a cup of coffee,
he recounted his journey in India’s
software industry and talked about
how open source has evolved in
India. The excerpts...


The journey began with ‘What Next?’

The first time I came to know about
open source seems like a long, long
time ago. Let me take you back in
time! Way back in late 1987, we
managed to get access to ERNET
(Education and Research Network, set
up in 1986 by the then Department of
Electronics). Apart from access, we
also got so involved that we started
helping the ERNET members by fixing
issues for them. This was when we
came to know that the global market
was getting ready with a much more
affordable, or rather, a free alternative
to UNIX (a costly affair then). Linux,
as it was called, excited us.

In fact, in 1991, I, along with three
friends, who were all Linux enthusiasts,
painfully managed to download the
SLS 0.91, the first Linux distribution
package onto 41 floppies. After
downloading it, the first thing that came
to our mind was, “What next?”

This question created a distance 
between me and Linux, as I did not know 
what to do with it. Years later, I was in 
the US attending a seminar where free 
copies of Slackware were distributed. 

On coming back home I realised the 
enormous potential and the possibilities 
it was open to. And since then, Linux has 
been an integral part of my life. 


I fell in love with open source
software only on the second
encounter

Though open source software excited 
me, we could not go along with it 
initially. But after my encounter with 
Slackware in 1994, there has been no 
looking back. I feel happy to be one of 
the very few early adopters of FOSS in 
India. But after tasting open source, I 
wanted others also to enjoy it and make 
use of the many benefits it offers. In fact,
I feel it’s the efforts of many like-minded 
people that helped open source evolve 
from being considered a ‘strange thing’ 
to becoming mainstream. 

I am a technology consultant with 
over 30 years of experience in the 
areas of communications, messaging 
solutions, application design and 
deployment, security systems and 
planning, and strategy. My core 
speciality lies in network architecture, 
security, unified messaging, project 
management, IT planning and 
strategy. I have successfully assisted 
several global corporations, as well 
as non-governmental organisations in 
defining their technology roadmaps 
and installing world-class solutions for 
their operations in India and globally. 
Needless to say, all of this was done 
using free and open source software 
and technologies. 


The need for events and a 
strong community 

In the 1990s, open source was a totally
new concept. The biggest teething issues
included compatibility of hardware and
software. To give you an example, to
run Linux, you needed specific drivers
as otherwise the modem would not be
compatible. For a person like me who is
a wildlife photographer, there was no
good open source video editing or photo
editing software. All this was hampering
mass adoption.

In 2000, we initiated events
like FOSS.In and Freed.In to create
awareness in Bengaluru and Delhi, 
respectively. These events earned huge 
credibility and contributed a lot to 
the adoption of open source. FOSS.In 
emerged as a premier event with 3000 
people attending this five-day event. 
For the first four years we focused on 
creating awareness and increasing the 
consumption of open source. The next 
six years went into channelising the 
contributions. 

Open source is all about the 
community because, from the start, 
we saw like-minded people coming 
together to use it; so it acted as a 
knowledge or experience sharing 
platform. This is how the idea of 
having ILUGs (India Linux User 
Groups) came to our minds. There 
were already groups in different cities 
working in isolation, and it took 
us some time to bring them under 
one banner with state/city specific 
chapters. 

For 15 years we nurtured the 
community, and then stepped back. 
The community changes every 
year as new talent keeps joining 
and contributing. The graph is only 
witnessing an upward trajectory. 

The community is still thriving; for 
instance, Python has a very active 
community today. 


What is open source? 

Let me clarify that Linux hit the Indian
market when piracy was at its peak. So,
the argument of it being free was not of
much significance either to individual
users or to enterprises. The only mantra
that worked was the freedom it gave
the consumers. All our efforts went into
promoting ‘freedom’ and that’s how we
came up with an event named Freed.In.


My definition of open source: For me it will always be, “Show me the code.”
Favourite book: ‘The Art of Community’ by Jono Bacon

Pastime: I am an amateur photographer and love bird and wildlife
photography. This pursuit takes me to some very interesting destinations.

Motivation: I am most motivated by helping others.

Also, open source is a lot more than
free code! It’s your code, your baby! 
One should understand that developers 
open source their projects because of the 
trust in the community. They keep on 
developing because they know that it is 
a good practice to develop along with 
people. This concept of the community 
in the open source world is so unique. 

In fact, today the industry recognises 
open source as ‘the need of the hour’ 
and prefers to hire people with an open 
source background rather than those 
certified on proprietary technologies. 


The motivation 

I am happy that I was among the
founding members of ILUG-Bengaluru,
ILUG-Delhi and ILUG-Goa. During this
journey I also wrote a lot to spread the
word about open source. Interactions
with the media and TV appearances
also helped in spreading the message.
Though the efforts that went into
this exercise did not bring immediate
financial benefits, we all knew that
continuous persistence would pay off,
some day. My professional strength
lies in conceptualising, installing
and maintaining the technology
infrastructure of large, multi-locational
organisations with open source in the
background.

What keeps me motivated is 
the act of helping others. I believe 
knowledge should be shared. By 
sharing knowledge it does not get 
diminished; rather, it only expands. 
And open source software does 
exactly that — it shares knowledge 
and helps others. 

My hobbies include wildlife 
photography and coffee brewing. 
These two things keep adding 
flavour to my life. 

The one book I always 
recommend is ‘The Art of 
Community’ by Jono Bacon. Open 
source is more about the people than 
it is about the technology. I also 
recommend ‘Revolution OS’ and 
‘The Code’, two very well-made 
documentaries on the subject. 


Tomorrow... 

For me, open source will always be
what Linus Benedict Torvalds, the
creator of the Linux kernel, once
said, “Talk is cheap. Show me the
code.” It is very interesting to know
that in the 90s India started out as a
consumer of open source software
and became one of its largest
consumers by the 2000s. Today,
India is a significant contributor
to the world of open source, and
I am glad to have witnessed and
participated in this transition. For
India, there’s no more ‘talking’; it is
all about the code now.

So the journey that began with
‘What next?’ (ending with a question
mark) has now reached a point
where ‘What next!’ ends with an
exclamation mark.
Interview


Over the years, and particularly in 2017,
chatbots have found wide applications
in large enterprises, enabling a better
understanding of consumer behaviour.
Innovations in artificial intelligence
(AI) and machine learning (ML)
technologies are expected to further
enhance the features of chatbots.

In a recent conversation, Ravi Pinto,
director - product management,
Oracle Cloud Platform, shared the
significance of open source in Oracle’s
intelligent bots with Syeda Beenish
of OSFY. He also talked about the
success of ‘Oracle Code’, a free event
for developers. Edited excerpts...


Ravi Pinto, 
director-product management, 
Oracle Cloud Platform 
What are the recent trends
vis-a-vis the role and scope 
of bots in India? 
Chatbots are the current flavour 
globally. A report by Gartner has 
estimated that by 2019, 40 per cent 
of enterprises will be actively using 
chatbots to facilitate business processes 
using natural language interactions. 

In India, the proliferation of 
smartphones, as well as the growth in 
broadband connectivity and messaging 
apps, is increasing the demand for 
chatbots. Enterprises are looking to 
connect with their customers, partners 
and employees in new and different ways. 
We're currently witnessing increased 
demand from banks, insurance firms and 
travel companies, amongst others. In the 
coming months, we expect bots to play a 
key role in other sectors as well, such as 
healthcare and government services. 


What is the future of AI
and chatbots in the Indian
market?
According to a recent report, the global 
chatbot market is expected to reach a 
turnover of US$ 1.23 billion by 2025, 
which represents a CAGR of 24.3 
per cent. In India, chatbot adoption is 
growing faster than most of the other 
APAC countries. With more AI getting 
embedded in chatbots, it’s only a matter 
of time before businesses adopt chatbots 
as a revenue generation engine. Just to 
give you an indication of the potential 
and scope in this field, in the last few 
months, AI and chatbots have featured 
in almost all my discussions with 
customers across industries. 


What features of chatbots 
most excite the tech decision 
makers among enterprise 
customers? 
There are various features that influence 
the decision to use chatbots. But 
the major differentiators or factors 
considered are: 
a) Time to market: how easily and
   quickly can you go live?
b) How robust is the engine? Can it
   handle a variety of languages? Can
   it connect to a variety of channels
   and messaging protocols?
c) How good are the product’s AI/ML
   capabilities, e.g., what is the F-score
   of the bot?


A ‘one size fits all’ approach
might not be the best for
chatbots. What are your views on
this? Can you share any use cases?
At Oracle, we understand that different 
companies have different requirements; 
so we bring in AI algorithms for deep 
learning, cognitive services, dialogue 
and context, and knowledge services, 
which we then fine tune for chatbots. 
With AI powered chatbots, organisations 
can finally deliver convenience and 
personalisation that customers prefer. 
Customers are increasingly noticing 
the difference between companies that 
have true AI-powered learning apps and
those that don’t. 




Another crucial factor is how quickly 
these bots can be built. Mature chatbot 
platforms such as Oracle’s enable the 
creation of chatbots with minimal coding. 

Bajaj Electricals, one of our India 
based customers, took just three weeks 
to build its pilot chatbot using Oracle’s 
intelligent bots, which are part of 
the Oracle Mobile Cloud Enterprise. 
The company is now in the process 
of executing a soft launch of its first 
customer service/support chatbot. 


What role is Oracle playing in
the chatbot space?

Oracle’s new bot-building capability, 
part of Oracle Mobile Cloud Enterprise, 
lets enterprises create new interactive 
customer experiences using drag-and- 
drop tools. Developers define how the 
conversation should flow, what kind 
of questions customers might ask,
and which messaging channels (such
as Messenger, Slack, voice-based
assistants, or others) customers can
use. Natural language processing 
capabilities built into the platform 
understand and learn the nuances and 
the context of conversations. And 
developers can use APIs to integrate the 
bot to back-end systems, as a way of 
pulling in data such as team schedules 
and seat availability. 

Oracle’s intelligent bots can support 
many of today’s most popular interaction 
channels including messaging clients 
such as Facebook Messenger, Kik, 
Skype, Slack, and digital voice assistants 
such as Amazon Echo, Amazon Dot and 
Google Home. Additionally, Oracle’s
intelligent bots provide native and
JavaScript SDKs to extend mobile
and Web-based applications with chat
and voice capabilities via Apple Siri,
Google Voice or Microsoft Cortana.


What is the scope of
open source in chatbot
development?
The core engine of a chatbot is powered 
by AI/ML components for intent 
recognition, natural language processing, 
etc. This is the domain of well-known 
open source projects such as TensorFlow, 
Stanford NLP, etc. Apart from these, open 
source projects such as Apache Spark and 
Kafka are also used by the bot engines. 


With chatbots gaining pace,
what are the expected hiring
trends?
Intelligent bots will transform every 
facet of every industry and dramatically 
improve the customer experience. In 
fact, chatbots can upskill employees 
with the most trending technology and 
also help them with relevant operations 
management. Apart from this, they can 
manage multiple tasks at the same time. 

The skillsets needed for building
chatbots are varied. These range from
UX expertise (how do you engage
a user in the absence of a GUI) to
expertise on domains, to understand
the user’s psyche and the range of
customer interactions.


What technologies are Indian 
developers focusing on? 
We see a very vibrant and active 
software development scenario in 
India, with developers working across 
the entire spectrum of technology. We 
find there’s more developer appetite 
for cutting-edge technologies such 
as containers, chatbots, etc, when 
building innovative applications. 




There are three trends most 
noticeable among Indian developers. 
First, cloud native development using 
microservices is set to go mainstream.
Microservices based development 
is no longer a new buzzword, but an 
established best practice to deliver 
products quickly to the business in 
the digital era. This boosts agility, 
giving enterprises a significant 
competitive edge. 

Second, real chatbot applications 
with natural language processing will 
become the norm this year. Using 
chatbots, existing applications can be 
extended to newer channels, while newer 
applications can interact in novel ways. 

And third, Node.js and JavaScript 
will continue to build a massive 
following. JavaScript has become 
the ubiquitous language with both 
client and server-side programming 
capabilities. Most new applications are 
built using JavaScript frameworks such 
as React, Angular, etc, with Node.js 
on the backend. Polyglot applications 
continue to remain a focus for 
developers looking to choose from a
multitude of programming languages. 
Above all this, open standards, 
open platforms and open source 
remain a key priority, since 
developers want zero lock-in. In 
line with Oracle’s commitment to 
open standards and open source, the 
company recently announced three 
new open source container utilities 
— Smith, a secure microcontainer 
builder; Crashcart, a microcontainer 
debugging tool; and Railcar, an 
alternative container runtime. 


How is Oracle engaging
with the Indian developer
community?
With the democratisation of coding, 
Oracle is committed towards developing, 
supporting, and promoting various 
technologies for developers. India is a 
critical market for us, and Oracle Cloud 
Platform validated for India Stack is a 
testimony to our commitment. We’ll
continue to engage with and empower 
the vibrant developer community here to 
build world-class products. 

‘Oracle Code’, which Oracle has been
successfully running globally, acts as
a learning platform for technical experts
and industry leaders, as well as a platform
where developers get to exchange
experiences. The company has brought its
global flagship developer event to India, 
conducting it in Bengaluru and Delhi in 
2017. Such events are an opportunity for 
developers to experience easy, modern 
and open cloud development technology 
with workshops and other live, interactive 
experiences and demos with global tech 
evangelists. Oracle Code not only acts 
as a learning platform but is used to 
exchange ideas and share experiences too. 

To experience cloud development 
technology with workshops and other 
live, interactive experiences and 
demos, Oracle once again invites the 
developer community to participate 
in this event in 2018. ‘Oracle Code — 
Live for the Code’ is scheduled to be 
held in Hyderabad and Bengaluru on 
April 4 and 10, respectively.
Typically, it’s large
enterprises that adopt
the latest technologies,
especially since they have
deeper pockets. Till now, 
machine learning (ML) was 
believed to be expensive 
and perceived to address 
the needs of only big firms 
handling Big Data. This 
article introduces readers 
to an ML solution that’s 
designed for the needs and 
budgets of SMEs. 


Odoo is a suite of open
source business apps that
help you grow your business.
This comprehensive
suite includes business
applications for sales,
CRM, project management,
warehouse management,
manufacturing,
financial management
and human resources,
all under one roof.


Implementing machine learning on a small scale 

In the next few years, machine learning (ML) is going to rule most industries and
organisations, ranging from small scale units to large enterprises. This technology
has already started to influence the operations of businesses, even though only
a few people are familiar with it at present. However, the actual success of a
technology is measured by how far it spreads its wings in all the fields of industry,
irrespective of the size of an organisation.

The current differences in the adoption rate of ML, with small scale operations 
rarely using ML and large scale industries being open to it, will get eradicated by 
the evolution of Axcensa’s inventions. One of these is the Odobot and another is 
the Axcensa Learning System. The latter performs the activities of a data analyst, 
translating the data it has in the form of numbers into plain text. This is displayed 
through the Odobot interface, which is a chatbot. It is wholly based on ML concepts, 
and exclusively integrated with Odoo ERP. 

Odobot gives small businesses the capability to make predictions. The data 
collected from their business is studied and processed by the Axcensa Learning 
System. Machine learning serves this purpose and by using Odobot, SMEs get the 
information to make informed and intelligent decisions based on this data. 

The use of Odobot will produce more positive outcomes with increasingly precise 
predictions made through the Axcensa Learning System. These outcomes are defined 
by what matters most to companies—higher sales and increased efficiency. 


Making data analysts and Big Data redundant 

With Odobot, data analysts will no longer be needed. Among small scale 
industries, the data analyst is considered as a minimal requirement and many 
SMEs find this a costly affair. Now, the Axcensa Learning System helps you by 
performing the tasks of a data analyst. 

Nowadays, Big Data analytics impacts almost all our day-to-day activities without
our knowledge. The Axcensa Learning System acts as a Big Data analyst (as well as a
‘small data’ analyst) by examining not only large data sets but small ones as well. This
system helps small scale industries by uncovering the hidden patterns in their operations
and making the necessary correlations to help the organisation make more informed decisions.

Even without the presence of a data analyst by your side and with no Big Data, an 
intelligent genie-like solution like Axcensa Learning System can help you, working in 
the background to enhance sales and productivity. Such a system provides you with 
the knowledge required to make the best business decisions, in time. The Mr Odobot 
interface will flash all the necessary information to you. 

The bottom line is that small businesses will benefit from machine
learning by saving costs, making better decisions, and earning more profits
whenever Odobot acts as an interface.

This is a great opportunity for SMEs to be part of this interesting field and
to effectively implement ML technology. It might take your business to the
next level in a relatively short period of time.


By: Chandrakumar 


The author is the CEO of Axcensa Technologies and loves to explore new technologies. 
He can be reached at chandrakumarm@axcensa.com 




www.OpenSourceForU.com | OPEN SOURCE FOR YOU | FEBRUARY 2018 | 27 


Overview


The Top Five Open Source 
Project Management Tools 
for Your Business 


Project management software tools enable efficient management of business 
processes. In this article, we take a look at the most effective open source project 
management tools that businesses can implement to achieve excellence. 


A project is defined as a collection of works or activities
that are expected or planned to be completed
within a predetermined period of time and within a
predefined budget. A project is characterised by a detailed 
set of tasks, which is also known as the scope of the project. 
It also has an expected time period, and a budget. Even in 
one’s daily or professional life, there are many tasks that 
require us to follow many steps to successfully complete 
them. For instance, taking a car out of a garage, driving it to 
the office and parking it there is regarded as a combination 
of activities and can also be termed a ‘project’. The term can 
also be applied to one-time tasks with a definite beginning 
and ending. A project involves doing the task in a proper 
sequence and following the project life cycle, which consists 
of the concept, design, execution, implementation and the 
commissioning or handing over of the project. 




In the real world, teams have to handle multiple projects 
at the same time, and it is not possible to simply rely on 
human memory to keep everything organised. And storing 
all data about a project in one place like an email folder or 
on a single computer as a simple Word file is not practically 
possible. To deliver the projects on time and within the 
predefined budget, information has to be sorted, documented 
well, deadlines marked and documents shared among the 
team members. It is highly important to have seamless and 
continuous communication among all team members. In 
order to manage the projects in an effective manner, the only 
solution is to deploy project management software. 

Project management software solutions are regarded as 
effective tools to provide the best-in-class organisational 
value and make managing a business more professional. 
Such software provides a common platform, right from 
scheduling and assigning tasks to tracking performance, 
finance and team tuning, making it a one-stop solution for 
all organisational needs. 

Recently, small businesses have started adopting SaaS 
based online cloud project management solutions as these 
keep track of all the key variables and help companies to 
deliver their projects on time and within budget. With the 
implementation of high-end security, cloud hosting and 
unlimited integration, various project management solutions 
are becoming more affordable, and many of them are 
completely free to use. 

Let’s now explore effective open source project 
management tools that businesses can implement. These 
tools are open source and currently deployed in various 
startups, SMEs and even in some large scale organisations. 
By researching the extent of their implementation in the 
real world, we arrived at the following top five open source 
project management tools that provide a comprehensive set 
of useful features for small businesses. 


OpenProject 

This is a Web based project management solution which 
was designed using Ruby on Rails and AngularJS. It was 
released under the GNU General Public License Version 
3 and is under consistent development by the open source 
community. It is highly suitable for location-independent 
team collaboration. 

OpenProject is ideal for project teams to work in 
throughout the project’s life cycle. The platform offers 
various features like collaborative project planning, timeline 
reports, management of tasks, reporting of project costs 
incurred from time to time, Scrum and much more. 

This open source project management tool is 
popular because of its intuitive user interface, powerful 
documentation and enhanced features. 


Features 

• Project planning and scheduling: OpenProject lets 
project managers easily define the project’s objectives 
and work specifications. The tool analyses the required 
activities and creates a professional project plan that 
demonstrates how and when the project will be delivered, 
on time and within the scope. 

• Product roadmap and release planning: This feature 
enables project managers to visualise, plan and 
communicate the project’s activities to the team 
members. It also helps to strategise the project roadmap 
by enabling transparency among these members. 

• Effective task management and team collaboration: 
OpenProject is a powerful tool to enable teams to keep 
consistent track of the work done as well as the work 
pending, and get effective results. It enables the team 
members to organise and assign tasks, and effectively 
communicate these at any point of time. 

• Agile and Scrum integration: Considering that agile 
based projects have a shorter development period, 
OpenProject provides a Scrum based platform for agile 
teams to build, measure and learn on. The teams can 
prioritise and do effective task tracking. This feature 
integrates high-end modules like roadmap planning, bug 
tracking and task management. 

• Time tracking, budgeting and reporting of costs: 
OpenProject creates an ideal platform to build projects 
and effectively track all activities. Progress can be tracked 
from a cost perspective to enable timely budgeting. 

• Bug tracking: OpenProject provides add-on benefits by 
ensuring quality via tracking tasks and through effective 
communication. With OpenProject, quality assurance 
managers can easily capture, differentiate and prioritise 
the bugs in projects. 

In addition to the above, OpenProject provides 
add-on features via cloud linkage especially designed 
for large scale enterprises. 


Latest version: 7.2 
Official website: https://www.openproject.org/ 


Figure 1: The OpenProject GUI 


ProjectLibre 
ProjectLibre is open source, free to use project management 
software developed by Marc O’Brien as an alternative to 
Microsoft Project. It’s used for task management, resource 
allocation, tracking of tasks, Gantt charts and much more. 
ProjectLibre is developed using Java and provides a user 
interface similar to that of Microsoft Project. It includes a 
ribbon-style menu and the same series of steps to create a 
project plan, i.e., create an indented task list or ‘work 
breakdown structure’ (WBS), set durations, create links, 
assign resources, etc. ProjectLibre runs on Windows, Linux 
and Mac OS. 
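
The planning steps listed above (a task list, durations and links between tasks) map onto a simple forward-pass schedule. The sketch below is a minimal illustration in Python, not ProjectLibre's actual scheduling engine; the task names, durations and the helpers earliest_finish and project_duration are hypothetical.

```python
from functools import lru_cache

# Hypothetical work breakdown: task -> (duration in days, predecessor tasks).
TASKS = {
    "Requirements": (3, []),
    "Design": (5, ["Requirements"]),
    "Development": (10, ["Design"]),
    "Test plan": (2, ["Requirements"]),
    "Testing": (4, ["Development", "Test plan"]),
}


@lru_cache(maxsize=None)
def earliest_finish(task: str) -> int:
    """Earliest finish day of a task: it starts once all predecessors finish."""
    duration, predecessors = TASKS[task]
    start = max((earliest_finish(p) for p in predecessors), default=0)
    return start + duration


def project_duration() -> int:
    """The project ends when its last task finishes."""
    return max(earliest_finish(name) for name in TASKS)


if __name__ == "__main__":
    for name in TASKS:
        print(f"{name:<13} finishes on day {earliest_finish(name)}")
    print("Project duration:", project_duration(), "days")
```

Linking tasks through predecessors is what lets a tool redraw the Gantt chart automatically when a single duration changes.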


Features 

• Compatible with MS Project: ProjectLibre is fully compatible 
with MS Project files from version 2003 to date. It has almost 
the same user interface and approach for creating a project 
plan. It enables project managers to do the following: create 
an indented task list, work breakdown structure, project 
duration, links, predecessors and resources. It also creates 
budgets and effectively manages expenses. 

• Advanced project management functionality: ProjectLibre 
can handle single as well as multiple projects with a 
wide set of tools like a calendar and interactive 
Gantt charts; it provides powerful scheduling and 
hierarchy features. It also provides templates for quick 
starts, and supports user stories and version control for 
agile methods. It keeps track of records and history, 
and provides other features like PERT charts, network 
diagrams and other budgeting features. 

• Advanced collaboration, issue tracking and other 
features: ProjectLibre supports enhanced social media 
linkage, forums, Web conferencing, etc. It also supports 
code integration and multiple workflows for issue 
tracking. It can even export PDFs without any restrictions. 

ProjectLibre will soon support ProjectLibre Cloud, 
with an interface similar to that of Google Cloud, which will 
enable users to manage and create projects anytime and 
anywhere, in the browser. 


Latest version: 1.7 
Official website: https://www.projectlibre.com 


Figure 2: ProjectLibre user interface 


ODOO 


ODOO, a comprehensive all-in-one project management 
software, was designed by Fabien Pinckaers and is packed with 
a complete suite of enterprise management applications. 
These include CRM, website/e-commerce, accounting, 
manufacturing, warehousing, project management, inventory, 
etc. The complete source code of the ODOO community 
edition is available on GitHub under LGPLv3. 

ODOO consists of an application server, which uses 
PostgreSQL as its backend database with a Web based client. 
It is written in Python, with a highly modular design, which 
allows rapid development of new modules through Open 
Object RAD. In addition, ODOO consists of 30 core modules 
and 3000+ community modules. It also provides strong 
technical support, as well as support for bug fixing and new 
development, apart from other services. 


Features 

• Interactive user interface: ODOO provides an interactive, 
mobile friendly user interface to track projects 
and perform tasks easily. It also has a fully customisable 
project process that can filter tasks in a smart way. 

• Project tasks: This feature comprises a comprehensive 
set of project management tasks like customised 
Kanban view, Gantt charts, graphs, pivot table analysis, 
time tracking, document management, multi-project 
management and calendar deadlines. 

• Enhanced communication features: These include features 
like email integration, custom alerts, user chat groups, live 
collaboration and activity logs. 

• Customer services: ODOO is fully equipped with 
timesheets, forecasts and a customer portal for successful 
project implementation. 

• Project reporting: This software is tightly integrated to 
provide enhanced project management reporting features 
like dashboards, tasks analysis and issues analysis. 

• Smart integration: ODOO has powerful APIs that 
do almost everything, and it integrates with Google 
Docs for linking all the documents and advanced 
accounting features. 


Latest version: 11.0 
Official website: https://www.odoo.com 


Figure 3: ODOO user interface 


Wekan 

Wekan is free, open source project task management and 
collaboration software that uses the Kanban approach for simple 
and fast workflow. With Wekan, project managers can create 
boards on which cards can be dragged around between columns. 
It is very easy to use and offers an interactive interface. 
After the creation of boards, users need to simply add 
the project team to the project and everything is set to take off. 
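
The board, column and card model described above can be pictured as a small data structure. This is only an illustrative sketch in Python, not Wekan's actual data model or API; the Board class, its column names and the move_card method are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List


def _default_columns() -> Dict[str, List[str]]:
    # Hypothetical default workflow; real boards let users name their columns.
    return {"To do": [], "Doing": [], "Done": []}


@dataclass
class Board:
    """A hypothetical Kanban board: named columns, each holding card titles."""
    columns: Dict[str, List[str]] = field(default_factory=_default_columns)

    def add_card(self, title: str, column: str = "To do") -> None:
        self.columns[column].append(title)

    def move_card(self, title: str, source: str, target: str) -> None:
        """Move a card between columns, as dragging it would on screen."""
        self.columns[source].remove(title)  # raises ValueError if absent
        self.columns[target].append(title)


if __name__ == "__main__":
    board = Board()
    board.add_card("Write release notes")
    board.add_card("Fix login bug")
    board.move_card("Fix login bug", "To do", "Doing")
    print(board.columns)
```
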

Wekan is released under the MIT License and is open 
for modifications and enhancements. It gives users full 
control over their data and can be hosted on the user’s own 
server. The Wekan community consists of over 400,000 members, 
which means there is a vibrant community working 
towards further improvements. 


Features 

• Enhanced task management with interactive GUI and 
easy, customisable options. 

• Kanban boards that allow card-based task management. 

• Can be easily installed via Docker, Sandstorm, Cloudron, 
Ubuntu Snap, Source and even Debian packages. 

• Open source with interactive timeline. Available as an app 
for iPhone and Android. 


Latest version: 0.63 
Official website: https://wekan.github.io/ 


Figure 4: Wekan user interface 


ZenTao 
ZenTao has been designed by an agile team for agile based 
software projects. It is regarded as a complete life cycle 
management tool and supports Scrum. 

ZenTao includes the SaaS version, ZenTao Cloud, 
which enables project managers to keep track of projects, 
anytime and anywhere. 


Features 

• Product management: Products, stories, plans, releases 
and roadmaps. 

• Project management: Projects, tasks, teams, builds and 
burndown charts. 

• Quality management: Bugs, test cases, test tasks and 
test results. 

• Document management: Product document library, project 
document library and customised document library. 

• Work management: To-do tasks and personal work 
management. 

• Organisation management: Departments, users, 
groups and privileges. 

• Reports: Statistical reports. 

It also has a Search feature. 


Latest version: 9.5.1 
Official website: https://zentao.pm 


Figure 5: ZenTao user interface 


Top Ten Android 
Smartphones and Tablets 


It is a bit difficult to choose a smartphone today, as there is a bewildering variety of 
phones in the market. This article will help the reader make an intelligent choice. 


Android is a mobile operating system designed and developed 
by Google (initially developed by Android Inc.), primarily for 
touchscreen mobile devices such as smartphones and tablets. 
There are multiple versions of the Android operating system 
available, the latest being Android 8 Oreo. The Android 
operating system is written in Java, C and C++. It is also 
available as Android TV for televisions, Android Auto for 
cars, and Android Wear for wrist watches. 

The basic features available in Android devices are shown 
in Figure 2. 


Main characteristics of a 
smartphone or a tablet 

There are many questions a potential 
buyer asks a dealer before purchasing 
a smartphone or a tablet. It is worth 
discussing the main features that one 
should check before deciding on a 
particular device. 

Display size and resolution: A display 
has two dimensions: size and resolution. 
If you are a regular user who needs a 
smartphone for checking emails, chatting 
and browsing social media, then a handset 
with a 12.7cm to 13.9cm (5-inch to 
5.5-inch) HD or full-HD display is 
perfect for you. If you want a larger 
screen, you can opt for a tablet. 
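
Size and resolution combine into pixel density (pixels per inch, PPI), a handy single number for comparing displays. The following Python sketch applies the standard formula, the diagonal in pixels divided by the diagonal in inches; the function name pixels_per_inch is an illustrative choice, and the sample figures are a 5.5-inch 1440 x 2560 panel and a 5.5-inch full-HD panel.

```python
import math


def pixels_per_inch(width_px: int, height_px: int, diagonal_inches: float) -> float:
    """Pixel density: length of the diagonal in pixels over its length in inches."""
    diagonal_px = math.hypot(width_px, height_px)
    return diagonal_px / diagonal_inches


if __name__ == "__main__":
    # A 5.5-inch 1440 x 2560 display versus a 5.5-inch full-HD (1080 x 1920) one.
    print(round(pixels_per_inch(1440, 2560, 5.5)))
    print(round(pixels_per_inch(1080, 1920, 5.5)))
```

At the same screen size, a higher resolution gives a higher pixel density, which is why two handsets of identical size can differ noticeably in sharpness.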

Processing power: What the heart 
does for a human, the processor does 
for Android devices. The processing 
power of a device varies from one 
device to another, depending on several 
factors such as the OS version, UI, 
bloatware and more. 

Camera — capturing moments: 
Most users want to know how many 
megapixels the camera has. But just 
having a higher number of megapixels 
does not mean that the camera is better. 
Several other specifications, such as 
camera aperture, ISO levels, pixel size, 
auto-focus and more, are also important. 


Figure 1: Android versions 


Figure 2: Android features 

Battery — the energy lifeline: It is 
important to have a battery with staying 
power. If you are a heavy user and work 
on apps, play games, stream videos and 
more, then go for a smartphone with at 
least a 3500mAh battery or above. If you 
are an average or light user, a handset 
with a 3000mAh battery would be good 
enough to run for a full day. 

The build — the benchmark for 
durability: The build is directly linked 
with the durability of a smartphone. The 
entire handset market is broadly divided 
into two types of builds: metal and 
plastic. Which option you select should 
be based on how you evaluate your own 
usage pattern. Do you tend to be very 
careful with your devices, or are you 
clumsy and rough? 

User interface — it’s all about 
interaction: The user interface and 
OS version are key factors to choosing 
the best devices. The interface is that 
part of the phone that you interact with 
each time you access any function, so it 
should be easy and simple. 

Storage — when data is a prized 
commodity: Nowadays, storage is 
available in options of 16GB, 32GB, 
64GB or more. Higher storage volumes 
are mainly required by users who store 
movies or play heavy games. 

Security — the main concern: Our 
smartphones contain private data and 
pictures that we don’t want to share 
with anyone. So, security features are 
important and can include fingerprint 
sensors or even iris sensors. 

Service — the connectivity: Another 
factor to consider when selecting a 
carrier and an Android phone is whether 
or not it supports the newer, high-speed 
4G networks. 

Design — the look and the feel: A 
very important feature is the look and 
feel of a device. If the device fits in 
your hand then it is easy to access its 
features. 


Top ten smartphones and tablets 

In this section we present what, in my 
opinion, are currently the top ten 
smartphones and tablets. 


1. Google Pixel XL 
Google Pixel XL was launched in October 2016, and is made by 
Google itself. In India, the price of the phone starts from 
₹ 38,198. It is available on Amazon, Flipkart or its official 
website. Google Pixel XL has a 1.6GHz quad-core Qualcomm 
Snapdragon 821 processor, which is fast and among the more 
recent chips, and 4GB of RAM. It runs the 7.1 version of 
Android, and performs excellently in all areas, including 
battery life. It has a single SIM slot. One drawback of this 
device is that its storage capacity cannot be expanded. 

Please refer to the official website for further queries or 
to buy the smartphone: 
https://store.google.com/product/pixel_phone 


Figure 3: Google Pixel XL 


2. OnePlus 5T 

OnePlus 5T was launched in November 2017. In India, its price 
starts from ₹ 32,999. The OnePlus 5T is driven by a 2.45GHz 
octa-core Qualcomm Snapdragon 835 processor and 6GB of RAM 
(comparable to many laptops). It provides the dual SIM (GSM 
and GSM) option, and takes a Nano SIM in each slot. It offers 
users a wide range of connectivity options including Wi-Fi, 
GPS, Bluetooth, NFC, USB OTG, 3G, 4G, etc. However, one 
disadvantage of this smartphone is that the photo quality in 
daylight is not the best. 

Please refer to the official website for further queries or 
to buy the smartphone: https://oneplusstore.in/St 


Figure 4: OnePlus 5T 


3. Nokia 8 
Nokia 8 was launched in August 2017. In India, its price 
starts from ₹ 32,899. This phone has a 1.8GHz octa-core 
Qualcomm Snapdragon 835 processor and 4GB of RAM. It is a 
dual SIM device that takes a Nano SIM in each slot. Its 
connectivity options include Wi-Fi, GPS, Bluetooth, NFC, 
USB OTG, 3G and 4G. The two drawbacks of this smartphone 
are its poor camera performance in low light and that it is 
not fully waterproof. 

Please refer to the official website for further queries or 
to buy the smartphone: 
https://www.nokia.com/en_in/phones/nokia-8 


Figure 5: Nokia 8 


4. Sony Xperia XZ Premium 

The Sony Xperia XZ Premium was launched in February 2017. It 
is designed by Sony Mobile (previously known as Sony Ericsson 
Mobile). In India, the price of the phone starts from 
₹ 59,990. The device has an octa-core Qualcomm Snapdragon 835 
processor and 4GB of RAM. It is a dual SIM smartphone that 
takes a Nano SIM in each slot. Its connectivity options 
include Wi-Fi, GPS, Bluetooth, NFC, USB OTG, 3G and 4G. It 
runs on Android 7.1.1. 

The speciality of this smartphone is its good build quality 
and that it is water-resistant. It doesn’t overheat under any 
conditions. On the other hand, it is difficult to maintain 
the SIM card tray in this phone. 

Official website: 
https://www.sonymobile.com/in/products/phones/xperia-xz-premium/ 


Figure 6: Sony Xperia XZ Premium 


5. Oppo F5 
Oppo F5 was launched in October 2017. It is designed by Oppo 
Electronics Corp. In India, the price of the phone starts 
from ₹ 18,299. It has an octa-core MediaTek MT6763T processor 
and 4GB of RAM. The Oppo F5 is a dual SIM smartphone that 
takes two Nano SIMs. Its connectivity options include Wi-Fi, 
GPS, Bluetooth, USB OTG, FM, 3G and 4G. It runs on Android 
7.1.1. The best thing about this smartphone is the good front 
and rear camera quality; so it is also called the 
‘Selfie Expert’. 

Official website: https://www.oppo.com/in/smartphone-f5 


Figure 7: Oppo F5 


6. Honor 8 Pro 
Designed by Huawei, the Honor 8 Pro was launched in April 
2017. In India, it sells from ₹ 25,500 onwards. The phone has 
a 1.8GHz octa-core Kirin 960 processor and 6GB of RAM. It is 
a dual SIM device that takes two Nano SIMs. Its connectivity 
options include Wi-Fi, GPS, Bluetooth, FM, 3G and 4G. This 
device runs on Android 7.0. Its speciality is its good camera 
performance, good battery life and lots of storage space. 

Official website: 
http://www.hihonor.com/in/products/mobile-phones/honor8pro/ 


Figure 8: Honor 8 Pro 


7. Xiaomi Redmi Note 4 
Xiaomi Redmi Note 4 was launched in August 2016, and is 
designed by Xiaomi. In India, the phone is priced from 
₹ 9,999 onwards. It has a 2GHz octa-core Qualcomm Snapdragon 
625 processor and 4GB of RAM. The Xiaomi Redmi Note 4 has a 
dual SIM slot that accepts a Micro SIM and a Nano SIM. Its 
connectivity options include Wi-Fi, GPS, Bluetooth, 
infrared, USB OTG, FM, 3G and 4G. This smartphone runs on 
Android 6.0. Its main drawback is that it has a hybrid SIM 
slot. 

Official website: http://www.mi.com/in/note4 


Figure 9: Xiaomi Redmi Note 4 


8. Samsung Galaxy Tab S3 
Designed by Samsung, the Galaxy Tab S3 (LTE) was launched in 
February 2017. In India, its price starts from ₹ 47,990. The 
tablet has a 1.6GHz quad-core Qualcomm Snapdragon 820 
processor and 4GB of RAM. The Samsung Galaxy Tab S3 (LTE) has 
only a single SIM. Its connectivity options include Wi-Fi, 
GPS and Bluetooth. Good in cellular data connectivity, the 
tablet includes an S Pen. However, this device allows 
limited multi-tasking. 

Official website: 
http://www.samsung.com/global/galaxy/galaxy-tab-s3/ 


Figure 10: Samsung Galaxy Tab S3 


9. Asus ZenPad 3S 10 
Asus ZenPad 3S 10 (Z500M) was launched in July 2016. Designed 
by ASUSTeK Computer Inc., the tablet has a 1.7GHz hexa-core 
MediaTek MT8176 processor and 4GB of RAM. Its connectivity 
options include Wi-Fi and Bluetooth. Sensors on the device 
include a proximity sensor, accelerometer, an ambient light 
sensor and gyroscope. The Asus ZenPad 3S 10 runs on 
Android 6.0. 

Official website: 
https://www.asus.com/Tablets/ASUS-ZenPad-3S-10-Z500M/ 


Figure 11: Asus ZenPad 3S 10 


The top ten smartphones and tablets at a glance 


Phone | Display | Processor | Camera | Battery | Build | UI | Storage | Security | SIMs 
------|---------|-----------|--------|---------|-------|----|---------|----------|----- 
Google Pixel XL | 13.9cm (5.5 inches), 1440 x 2560 | Qualcomm Snapdragon 821, 1.6GHz quad-core | R: 12.3 MP, F: 8 MP | 3450 mAh | Metal, plastic and glass | Android 7.1 Nougat | 32GB internal, 4GB RAM | Face detection | 1 
OnePlus 5T | 15.2cm (6.01 inches), 1080 x 2160 | Qualcomm Snapdragon 835, 2.45GHz octa-core | R: 20 MP, F: 16 MP | 3300 mAh | Full metal | Android 7.1.1 Nougat | 64GB internal, 6GB RAM | Face detection, fingerprint scanner | 2 
Nokia 8 | 13.4cm (5.30 inches), 1440 x 2560 | Qualcomm Snapdragon 835, 1.8GHz octa-core | R: 13 MP, F: 13 MP | 3090 mAh | Aluminium unibody design with metal | Android 7.1.1 | 64GB internal, 4GB RAM | Fingerprint detection | 2 
Sony Xperia XZ Premium | 13.9cm (5.50 inches), 2160 x 3840 | Qualcomm Snapdragon 835, octa-core | R: 19 MP, F: 13 MP | 3230 mAh | Metal body | Android 7.1.1 | 64GB internal, 4GB RAM | Fingerprint scanner | 2 
Oppo F5 | 15.2cm (6.00 inches), 1080 x 2160 | MediaTek MT6763T, octa-core | R: 16 MP, F: 20 MP | 3200 mAh | Metal edges | Android 7.1.1 | 32GB internal, 4GB RAM | Fingerprint detection, face recognition | 2 
Honor 8 Pro | 14.4cm (5.70 inches), 1440 x 2560 | Kirin 960, 1.8GHz octa-core | R: 12 MP, F: 8 MP | 4000 mAh | Metal body | Android 7.0 | 128GB internal, 6GB RAM | Fingerprint detection | 2 
Xiaomi Redmi Note 4 | 13.9cm (5.50 inches), 1080 x 1920 | Qualcomm Snapdragon 625, 2GHz octa-core | R: 13 MP, F: 5 MP | 4100 mAh | Metal body | Android 6.0 | 64GB internal, 4GB RAM | Fingerprint detection | 2 
Samsung Galaxy Tab S3 | 24.6cm (9.70 inches), 2048 x 1536 | Qualcomm Snapdragon 820, 1.6GHz quad-core | R: 13 MP, F: 5 MP | 6000 mAh | Metal body | Android 7.0 | 32GB internal, 4GB RAM | Fingerprint detection | 1 
Asus ZenPad 3S 10 | 24.6cm (9.70 inches), 1536 x 2048 | MediaTek MT8176, 1.7GHz hexa-core | R: 8 MP, F: 5 MP | 5900 mAh | Metal body | Android 6.0 | 32GB internal, 4GB RAM | Fingerprint detection | 2 
Lenovo Yoga Tab 3 Pro | 25.6cm (10.10 inches), 2560 x 1600 | Intel Atom X5, 2.24GHz quad-core | R: 13 MP, F: 5 MP | 10200 mAh | Metal body | Android 5.1 | 32GB internal, 2GB RAM | - | 1 

10. Lenovo Yoga Tab 3 Pro 
Lenovo Yoga Tab 3 Pro was launched in September 2015. 
Designed by Lenovo, its price in India starts at ₹ 34,900. The 
Lenovo Yoga Tab 3 Pro has a 2.24GHz quad-core Intel Atom 
X5 processor and 2GB of RAM. It runs on Android 5.1, and has a 
single SIM slot. Its connectivity options include Wi-Fi, GPS, 
Bluetooth, USB OTG and 4G. 

Official website: https://www3.lenovo.com/in/en/ 


Figure 12: Lenovo Yoga Tab 3 Pro 


By: Bhagyashri Jain 


The author is a systems engineer and loves Android development. 
She likes to read and share daily news on her blog at 
http://bjlittlethings.wordpress.com 


Open Source Technologies 
ANALYSIS 


This article covers emerging open source technologies—the many trends that drive 
success today and will do so in the future. These range from how the entire tech industry 
is getting reshaped by digital transformation and the makeover of all tech enterprises 
from ‘Digital Immigrants’ to ‘Digital Natives’, to the rise of innovation accelerators and the 
emergence of next generation open source tech platforms. 


With open source technology having been around 
for quite a long time now, a thriving global 
community has grown around it; code is shared 
among developers, and everyone can test, re-build and learn 
from each other. As industry has begun adopting open source 
technologies in almost all verticals, many new technologies 
have used open source as their very foundation. 
Here are some of the areas in which groundbreaking 
open source technologies are set to revolutionise 
the world as we know it. 


1. Open source in machine learning 

Machine learning (ML) is the study of algorithms that use 
large data sets to learn, generalise and predict. The most 
exciting aspect of ML is that with more data, the algorithm 
generally improves its predictive power. ML has acted as a 
strong base for self-driving cars, speech recognition, home 
automation products and much more. Machine learning is 
closely related to computational statistics, which also 
focuses on making predictions via computers. It is regarded 
as an effective method for deploying complex models and 
algorithms that lend themselves to prediction in the 
commercial space; this is known as predictive analytics. 
Machine learning tasks are broadly categorised into 
supervised learning, unsupervised learning and reinforcement 
learning, with semi-supervised and active learning sitting 
between the first two. 
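
As a concrete instance of supervised learning, the sketch below fits a straight line to labelled examples by ordinary least squares and then predicts an unseen value. It is a minimal pure-Python illustration rather than the method of any engine named in this article; the function names and the toy data are assumptions.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Apply a fitted (slope, intercept) pair to a new input."""
    slope, intercept = model
    return slope * x + intercept


if __name__ == "__main__":
    # Toy training data (hypothetical): hours of use vs. battery drained (%).
    hours = [1, 2, 3, 4, 5]
    drained = [11, 19, 31, 39, 50]
    model = fit_line(hours, drained)
    print("slope, intercept:", model)
    print("predicted drain after 6 hours:", predict(model, 6))
```

The labelled pairs play the role of training data; the fitted slope and intercept are the 'model' that generalises to inputs it has not seen.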

Recent ML engines and toolkits from various IT giants, 
several of them open sourced, include Google Cloud Machine 
Learning Engine, TensorFlow by Google, Amazon’s ML engine for 
AWS, Unity ML Agents, Apache PredictionIO, Microsoft 
Distributed Machine Learning Toolkit, etc. 

• Google Cloud Machine Learning Engine: This is a managed 
service enabling users to easily build operational ML 
models to work on any type or size of data. It lets 
TensorFlow models be trained at large scale on managed 
clusters, and is also equipped to manage the trained 
models for large scale online and batch predictions. It is 
integrated with Google Cloud, allowing users to access 
data on Google Storage. 

• Unity Machine Learning Agents: The Unity ML Agents 
SDK allows developers and researchers to transform 
games and simulations developed via Unity’s editor into 
environments where intelligent agents can be trained 
effectively using deep reinforcement learning. 

• Apache PredictionIO: This is an open source ML stack 
integrated with Apache Spark, MLlib, HBase, Spray 
and Elasticsearch to create predictive engines for 
all sorts of ML tasks. 

• Amazon Machine Learning: Amazon’s ML technology is 
used to power its recommendation engine, the 
Alexa-powered Amazon Echo/Dot, Prime Air drones, 
Amazon Go and AWS based cloud services. 

• Microsoft Distributed Machine Learning Toolkit: This 
toolkit by Microsoft provides a framework for training 
ML models on Big Data. The toolkit contains both 
algorithm and system innovations to make tasks on Big 
Data highly scalable, efficient and flexible. 


2. The R programming language 

R is a free and open source language and environment for 
statistical computing and graphics. It is widely used for ML 
and runs on a wide range of operating systems. It provides 
diverse statistical functionalities like linear and 
non-linear modelling, classical statistical tests, 
time-series analysis, classification, clustering and advanced 
graphical techniques. It is a highly dynamic and extensible 
language that provides advanced features for data 
manipulation, calculation, graphics display, array 
calculation, data analysis tools, etc. As a full programming 
language, it also offers conditions, loops and other 
capabilities. 

R contains more than 11,000 packages of various kinds, 
which are available through the Comprehensive R Archive 
Network (CRAN), Bioconductor, Omegahat, GitHub and 
other repositories. 


3. Emerging trends in blockchains and Bitcoin 
Blockchain technology is rapidly undergoing intense 
development due to great interest from academia and the 
industry sector. A blockchain is regarded as a shared, open 
source transactional database for tracking transactions of 
digital currency like Bitcoin. 

According to Don and Alex Tapscott, “The blockchain is 
an incorruptible digital ledger of economic transactions that 
can be programmed to record not just financial transactions 
but virtually everything of value.” 

A blockchain is a continuously growing list of 
records called blocks which are linked and secured using 
cryptography, with every block having a hash pointer as a link 
to the previous block, a time stamp and transaction data. 
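
The structure just described (blocks that each carry a time stamp, transaction data and a hash pointer to the previous block) can be sketched in a few lines of Python. This is a toy illustration only: it has no consensus, mining or networking, so it is not how Bitcoin or any real platform is implemented, and the field names and helper functions are assumptions.

```python
import hashlib
import json
import time


def block_hash(block: dict) -> str:
    """SHA-256 over the block's contents, serialised deterministically."""
    payload = json.dumps(block, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def add_block(chain: list, transactions: list) -> dict:
    """Append a block that stores the hash of the previous block."""
    previous = block_hash(chain[-1]) if chain else "0" * 64
    block = {
        "index": len(chain),
        "timestamp": time.time(),
        "transactions": transactions,
        "previous_hash": previous,
    }
    chain.append(block)
    return block


def is_valid(chain: list) -> bool:
    """Every block must point at the hash of the block before it."""
    return all(
        chain[i]["previous_hash"] == block_hash(chain[i - 1])
        for i in range(1, len(chain))
    )


if __name__ == "__main__":
    chain = []
    add_block(chain, ["genesis"])
    add_block(chain, ["Alice pays Bob 5"])
    add_block(chain, ["Bob pays Carol 2"])
    print("valid:", is_valid(chain))
    chain[1]["transactions"] = ["Alice pays Bob 500"]  # tamper with history
    print("valid after tampering:", is_valid(chain))
```

Because each block embeds the hash of its predecessor, altering an old block changes its hash and breaks the link from the next block, which is what makes the ledger tamper-evident.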


Blockchain platforms 

• ERIS: This allows everyone to create their own secure, 
low-cost application (that can run anywhere) using 
blockchain and smart contract technology. 

• HydraChain: This is an extension to the Ethereum 
platform, with added support for creating permissioned 
distributed ledgers. 

• MultiChain: This enables organisations to rapidly design, 
deploy and operate distributed ledgers. 

• OpenChain: This is open source distributed ledger 
technology suited for organisations wishing to 
issue and manage digital assets in a highly 
secure and flexible manner. 

• Ethereum Project: This is a decentralised platform 
that runs smart contracts, i.e., applications that run 
exactly as programmed, without any possibility of 
downtime, censorship, fraud or third party interference. 

• Hyperledger: This is an open source collaborative effort 
to advance cross-industry blockchain technology and 
ensure transparency, interoperability and support to bring 
blockchain technologies to commercial adoption. 


4. Open source and the Internet of Things (IoT) 
The Internet of Things (IoT) is highly fragmented and 
changing continuously. Open source is playing a crucial role 
in creating IoT platforms as well as ready-made prototypes in 
terms of development boards for R&D and automation. 

IoT standards, together with Artificial Intelligence (AI), 
are controlling and interpreting a wide range of activities in a 
smart manner. Without the use of open source technologies, 
there would be no Web. IoT requires the same level of ubiquitous 
common access, both in its core functions and to the Web, for 
shared accessibility. IoT deployments in the near future are 
expected to connect and integrate globally with billions of 
devices, assets, sensors and end points. 

IoT platforms are regarded as the middleware layer 
between IoT devices or end points and services that consume 
data outputs. These platforms offer sophisticated end point 
management to control the devices. 


Open source IoT platforms

= Kaa IoT: This is an efficient, open source and cloud
based IoT platform which enables data management
for connected objects and back-end infrastructure by
providing server and endpoint SDK components.

= SiteWhere: This provides ingestion, storage, processing
and integration of device data. It runs on core servers
provided by Apache Tomcat, and contains MongoDB and
HBase implementations.

= ThingSpeak: This allows users to collect and store
sensor data in the cloud and on popular IoT applications
development platforms. It works well with Arduino,
ESP8266, BeagleBone, Raspberry Pi, MATLAB, etc.

= DeviceHive: This provides Docker and Kubernetes
deployment options. It has the power to connect
to any device or hacker board via REST APIs,
WebSockets, MQTT, etc.

= Thinger.io: This is an open source platform for IoT and
provides scalable cloud infrastructure for connecting
devices. It supports all types of boards like Arduino,
ESP8266, Raspberry Pi and Intel Edison.
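
As a rough illustration of the REST style of access that platforms such as ThingSpeak expose, the sketch below builds (but does not send) a channel update request. The helper name build_update_url, the placeholder write key and the sample reading are assumptions made for this example; only the update endpoint and its api_key and field1 parameters come from ThingSpeak itself.

```python
from urllib.parse import urlencode

# ThingSpeak's public REST endpoint for writing a reading to a channel.
THINGSPEAK_UPDATE_URL = "https://api.thingspeak.com/update"


def build_update_url(write_api_key, temperature_c):
    """Return the URL that would store one reading in field1 of a channel.

    build_update_url is a hypothetical helper written for this sketch.
    """
    query = urlencode({"api_key": write_api_key, "field1": temperature_c})
    return f"{THINGSPEAK_UPDATE_URL}?{query}"


if __name__ == "__main__":
    # Placeholder key: a real one is issued per channel by ThingSpeak.
    print(build_update_url("YOUR_WRITE_API_KEY", 24.5))
```

A device or script would then issue an HTTP GET on the resulting URL, typically at a fixed interval, to push each new sensor reading.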

Open source IoT hardware - development boards: 
Various hardware development boards like Arduino, 
Freeduino, Raspberry Pi, BBC MicroBit, Orange Pi, 
Pine A64 as well as operating systems like Raspbian, 
Kodi, etc, are based on open source. 


5. Open source and Big Data analytics 

IDC says that worldwide revenues for Big Data and
business analytics will rise from US$ 130 billion in
2016 to more than US$ 203 billion in 2020 with an annual
growth rate of 11.7 per cent. Nowadays, most organisations 
understand the value of capturing all the data streaming 
inside the business and hence employ open source Big Data 
analytics to gain crucial advantage from it. 
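
The IDC figures can be sanity-checked with a short calculation. The sketch below assumes that the 11.7 per cent figure is a compound annual growth rate applied over the four years from 2016 to 2020; the function names are ours and not part of any library.

```python
def compound_annual_growth_rate(start_value, end_value, years):
    """Growth rate that turns start_value into end_value over `years` years."""
    return (end_value / start_value) ** (1 / years) - 1


def project(start_value, rate, years):
    """Value reached when start_value grows at `rate` per year for `years` years."""
    return start_value * (1 + rate) ** years


if __name__ == "__main__":
    # Revenues in US$ billion, as quoted in the text.
    implied = compound_annual_growth_rate(130, 203, 4)
    print(f"Implied annual growth: {implied:.1%}")
    print(f"130 grown at 11.7% for 4 years: {project(130, 0.117, 4):.1f}")
```

Under that assumption, US$ 130 billion growing at 11.7 per cent a year comes to roughly US$ 202 billion after four years, so the quoted figures are consistent with one another.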


Open source software and Big Data go hand in hand
these days since today’s applications can handle diverse
data in an effective manner, as it grows exponentially in
variety, volume, velocity and veracity.


The popular Big Data analytics tools and platforms are:
= Apache Hadoop: Low cost distributed computing
for Big Data
= GridGain: Faster analysis of real-time data
= Cassandra: Manages huge databases
= Terrastore: Popular for scalability and elasticity
= KNIME: Best tool for performance management and
data integration
= RapidMiner: For faster processing of data and to
simplify predictive analysis
= Solr: Scalable and reliable tool for Big Data file transfer
and aggregation
= Terracotta: Enables enterprise applications to store and
manage Big Data in server memory
= Avro: Data serialisation system based on JSON-defined
schemas
= Oozie: Coordinates scheduling of Hadoop jobs
= ZooKeeper: Centralised service for maintaining
configuration information, naming, distributed
synchronisation and group services
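
To make the idea of a 'JSON-defined schema' concrete, here is a minimal sketch of what an Avro record schema looks like, along with a toy conformance check. The SensorReading record and the conforms helper are invented for this illustration, and the check uses only Python's standard library; it is not the Avro library and does not perform Avro's binary serialisation.

```python
import json

# An Avro record schema is itself a JSON document.
# The record name and fields below are made up for this example.
SCHEMA_JSON = """
{
  "type": "record",
  "name": "SensorReading",
  "fields": [
    {"name": "device_id", "type": "string"},
    {"name": "temperature", "type": "double"},
    {"name": "sequence", "type": "long"}
  ]
}
"""

# Simplified mapping from a few Avro primitive type names to Python types.
PRIMITIVES = {"string": str, "double": float, "long": int, "boolean": bool}


def conforms(record, schema):
    """Toy check: same field names as the schema, and matching primitive types."""
    fields = schema["fields"]
    if set(record) != {field["name"] for field in fields}:
        return False
    return all(
        isinstance(record[field["name"]], PRIMITIVES[field["type"]])
        for field in fields
    )


if __name__ == "__main__":
    schema = json.loads(SCHEMA_JSON)
    reading = {"device_id": "pump-7", "temperature": 21.5, "sequence": 42}
    print(conforms(reading, schema))
```

In a real deployment, an Avro library would use the same schema to encode and decode records compactly, so that producers and consumers agree on the structure of the data.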


6. Progressive Web Apps (PWAs) 

Progressive Web Apps (PWAs) bring a mobile-app like 
experience to end users without any app installation 
requirements. Designed by Google, these apps were 
promoted in the Google I/O 2017 conference. 


PWAs take advantage of the much larger Web
ecosystem, plugins, community and the relative ease of
deploying and maintaining a website when compared to
a native application in the respective app stores. A PWA
takes advantage of a mobile app’s characteristics, resulting
in improved user retention and performance, without the
complications involved in maintaining a mobile application.

Progressive websites are rapidly growing in popularity
as a way to build apps with JavaScript, CSS and HTML, and
they have a level of performance and usability that’s nearly 
identical to native apps. 

PWAs are able to work with most browsers and devices, 
fit in all screens with responsive designs, enable offline 
connectivity, and offer an app-like experience with features 
like push notifications and Web app manifest. 
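
The Web app manifest mentioned above is a small JSON file that tells the browser how to present the site when it is installed. The sketch below generates one; the app name, colours and icon paths are placeholders chosen for this example, while the member names (name, short_name, start_url, display, icons, and so on) are those defined for Web app manifests.

```python
import json


def build_manifest(name, short_name):
    """Return a minimal Web app manifest as a Python dictionary."""
    return {
        "name": name,
        "short_name": short_name,
        "start_url": "/",
        # "standalone" asks the browser to hide its own UI, like a native app.
        "display": "standalone",
        "background_color": "#ffffff",
        "theme_color": "#2196f3",
        # Placeholder icon paths; a real site would ship these image files.
        "icons": [
            {"src": "/icons/icon-192.png", "sizes": "192x192", "type": "image/png"},
            {"src": "/icons/icon-512.png", "sizes": "512x512", "type": "image/png"},
        ],
    }


if __name__ == "__main__":
    print(json.dumps(build_manifest("Demo Progressive App", "Demo"), indent=2))
```

The generated JSON would typically be saved alongside the site's pages and referenced from each page with a link element whose rel attribute is set to manifest.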


Tools for building Progressive Web Apps 

= React: This is managed and supported by Facebook and is the
foundation for React Native, so apps built with React can be
ported to native apps with relative ease.

= Polymer Template: This is supported by Google and uses the PRPL
pattern to optimise delivery of the app to the device.

= Webpack: This module bundler is essential for complex, front-end
driven progressive websites that comprise a large number of
JavaScript modules.

= Lighthouse: This is a powerful Google PWA performance
monitoring tool that tests load times and performance in page
loads, as well as security in network connections, design
and the user interface.


7. Open source and cyber security 

When creating a security policy for any organisation, or 
when building a security operations or research centre, the 
prime requirement is to have the right people, processes and 
effective tools. 

The open source market is filled with lots of security 
tools and even Linux distributions like Kali Linux, the Parrot 
Security Tool Set, Network Security Tool Kit, Cyborg Hawk 
and many more. All these are used by security and penetration 
testing professionals for real-time security operations and to 
find vulnerabilities and backdoors in existing networks. 

Open source intelligence (OSINT) is a methodology for 
using open source tools to collect information from publicly 
available sources, carry out analysis on the data and take the right 
course of action. To prevent security attacks on a network, it is of 
utmost importance to understand the information being collected 
by the organisation and the software being used to collect it. 

As per the latest open source security analysis reports 
(Black Duck Software, www.blackducksoftware.com), 67
per cent of the applications reviewed contain open source security
vulnerabilities, with 40 per cent of open source vulnerabilities 
in each application being rated as ‘severe’. 

In the US, it is expected that the 2018 budget for the 
Pentagon and the US’ Department of Defense (DoD) will 
include the launch of a new pilot programme, for which 20 
per cent of the custom code developed will use open source
software, in keeping with Linus’s Law: “Given enough eyeballs,
all bugs are shallow.” GitHub hosts
more than 70 million open source projects, of which 600,000 
components are downloaded over 14 billion times.

The following are the top operating systems used by hackers
and penetration testers. These are under continuous development 
to make them impenetrable against real-time threats: 
= Kali Linux 
= Parrot Security OS 
=  BackBox Linux 
= Samurai Web Testing Framework 
= Deft Linux 
= CAINE 
= Network Security Toolkit (NST) 
= BlackArch Linux 
= ArchStrike Linux 
= Cyborg Hawk 
= Fedora Security Spin 
= BugTraq 
= Node Zero 
= Weakerthan 
= Dracos Linux 


8. Open Source and virtual reality, augmented 
reality and mixed reality 

Recent years have seen an increased interest in the 
implementation and usage of virtual reality, augmented 
reality and even mixed reality -- particularly in the areas of 
healthcare applications, military, fashion, sports, construction, 
media, telecommunications, films and entertainment, 
engineering and education. However, higher adoption is 
limited due to the high costs of developing such software, 
the lack of technical skills and the difficulties in the 
implementation environment. To overcome these hurdles, 
open source has come to the rescue. 

Open source has deep roots in this domain, and these will 
grow deeper and more advanced in the coming few years -- in 
terms of live implementations, SDKs and software. With the 
implementation of open source, a new reality called XR or 
Extended Reality has evolved, which is pretty advanced and 
has many live implementations. 

Virtual reality, augmented reality and mixed reality 
sales are expected to touch US$ 2.8 billion by 2018. Various 
companies like Facebook, Google, Microsoft, Magic Leap, 
HTC, Samsung, WorldViz, Unity, Snap and FirstHand 
Technologies have already started adopting open source based 
technologies for bringing out hardware and software cum 
SDKs supporting VR, AR and MR. 

The Open Source Virtual Reality (OSVR) Consortium 
and Xilinx are also working on coming out with various VR 
hardware and software for end users. 

The following are various toolkits, platforms and SDKs 
for VR, AR and MR:
= OSVR - Open Source Virtual Reality for gaming 
= ARToolkit - An open source AR SDK
= Apertus VR - An open source VR and AR engine
= OpenSpace3D - An open source platform for 3D
environments

= Holokit - An open source MR kit

= Mixare - An open source AR engine

= High Fidelity - An open source shared VR app

= OpenMask - Open source middleware for VR 

= CalVR - An open source VR software framework 


10. Open source and cloud computing 

Cloud computing is one of the most significant
technologies to have emerged in the last couple of years,
and is likely to remain so over the next couple of years, attracting
billions of dollars in investments. The cloud computing
industry is expected to touch US$ 241 billion by 2020. 
Common open source cloud applications are CloudStack, 
OpenNebula or OpenERP servers. 

Innovations in cloud computing have led to remarkable 
trends like the increasing adoption of the public cloud; the 
adoption of container technologies like Docker, Kubernetes, 
Apache Mesos, LXD, etc; and the adoption of DevOps in 
application delivery and open source tooling by companies. 

Cloud computing and open source share common goals 
like minimising costs by not paying licence fees. It is 
generally accepted that without open source software, cloud 
computing would not have been able to grow as rapidly. 
Today most cloud computing vendors use open source 
software to develop their systems. Examples include Red Hat 
for cloud operating systems and infrastructure, Eucalyptus and 
OpenStack for Infrastructure-as-a-Service implementation, 
Cloudera for the open source Hadoop software framework, 
OpenNebula for the open source VM, the Xen hypervisor
for server virtualisation management, Cloud Foundry and 
OpenShift for an open Platform-as-a-Service, etc. 

The following are some useful tools for cloud computing. 
= OpenStack: This enables data centres to offer the
combined resources of computing, storage and networking,
with a GUI via a dashboard for effective management.

= CloudStack: This deploys and manages large networks of
virtual machines.

= Eucalyptus: This facilitates easy migration of apps and
data; it creates private and hybrid cloud environments.

= openQRM: This enables the building of private, public and
hybrid IaaS clouds.

= OpenShift: This enables easy management of cloud based app
development.

Other open source cloud computing simulators include 
CloudSim, CloudAnalyst, iCanCloud, GreenCloud, 
CloudSched, etc.


The Top Twelve


2017 has been a year in which open source software has flourished. This presentation 
offers a bird’s eye view of the past year’s best offerings in software. 


One of the world’s most reputed technology research
firms, Gartner, had predicted early in 2017 that the
next decade would be spurred by a whirlwind of
advances undergirded by a focus on ‘artificial intelligence
everywhere’, ‘digital platforms’ and ‘transparency’. Little
did we know that the prediction would ring true from the
following month itself.

It has been a landmark year for open source! We are 
on the cusp of a revolution, poised at the frontiers of the 
technological rebirth of our world, triggered by a spectrum of 
possibilities that are opening—from 5G to blockchain — all 
of which are influencing businesses to adopt a decentralised 
approach in an effort to keep up with the times. 

With the game changing results provided by deep learning 
algorithms, the tech industry is paying close attention to all 
the possible use cases. From Google and Uber to Baidu and 
Tesla, almost every major tech company has invested heavily 
in applying such practices in a bid to improve their offerings 
and introduce novel features for the consumer. 

The recent hype also sent cryptocurrency prices soaring,
before they plummeted back to reasonable levels. The
2017 Christmas season was peppered with initial coin
offerings, as entrepreneurs went all out to harness positive 
public sentiments and channel these into VC funding. In 
fact, the largest players in the financial market have set 
up consortiums to look at how these technologies can 
be incorporated into their business models in the current 
climate. All this could lead to radical changes in the rather 
archaic and constrained institutions that have driven the 
fiscal aspects of our lives thus far. 

So, all in all, it’s been a highly transitional year
as we move into a decade that will resonate with 
pervasive artificial general intelligence. We need to 
come to terms with a huge leap from the status quo into 
bleeding edge technology. 


Top open source projects of 2017 

A great year for open source projects, 2017 witnessed
a number of major changes to existing projects and an
assortment of new projects targeted at all aspects of the
software development life cycle. Some of the most popular
offerings have been highlighted in this article. The order is
in no manner representative of relative importance.

Figure 1: 2016-2017 trends in open source adoption (Image courtesy: Rackspace)


1. Hyper: A terminal built using HTML, CSS and JS 
The creators of Hyper had one basic goal in mind— to create 
a simple, efficient and, most importantly, hackable command 
line interface for developers. Intuitive shortcuts can be added 
in Hyper based on developer preferences, and there is the 
option of building your own plugins to provide additional 
functionality to this tiny but powerful application. 


2. Parse Server: The open source
version of the Parse back-end
This software can be deployed in any infrastructure that
is capable of running Node.js, and works well with the
Express Web Application Framework. It can be deployed 
independently, or added to existing Web applications. 
Supported by an array of tutorials and extensive 
documentation, this has proved to be a popular, community- 
backed project. 


3. TensorFlow: Google’s open source
machine learning framework
Originally released in 2015, the TensorFlow project saw
major updates in 2017. These included support for Python
generators, additions to various APIs for performance
improvements and, most importantly, the addition of Keras
to the TensorFlow core package. In addition, TensorBoard
was released to help improve visualisation via plotting of
quantitative metrics on a graph. TensorFlow is currently the
most popular project on GitHub with over 77,000 stars. In
comparison, the Linux project has just about 53,000 stars!


4. Caffe2 and PyTorch: Updates to both open 
source machine learning frameworks 
Facebook-backed PyTorch was developed in response
to Google’s release of TensorFlow. Facebook also uses
Caffe2 for training deep learning models in production.
An extremely interesting development in the recent
releases of both these frameworks is Facebook
and Microsoft’s partnership on releasing updates to these
frameworks based on a new format called the Open Neural
Network Exchange (ONNX). ONNX is a standard for the
representation of deep learning models that allows models 
to be transferred between frameworks. It is going to be 
interesting to monitor progress on this front. 


5. Bulma: A FOSS CSS framework
based on Flexbox

Available as a more intuitive alternative to popularly used 
Bootstrap, Bulma is essentially a single file— bulma.css; 
it does not include any JavaScript since people generally 
want to use their own JavaScript implementation. It can be 
considered ‘environment agnostic’ since it’s just the style 
layer on top of the logic. 


6. Anime: A lightweight, open source
JS library tailored for animation

This is a lightweight JavaScript animation library. Anime 
works with any of CSS’ properties, individual CSS 
transforms, SVG or any DOM attributes, as well as with 
JavaScript Objects. 


7. Yarn: A fast, reliable and secure 
dependency management software 

Introducing a number of unconventional and innovative
features, Yarn presents a convenient alternative to existing
package managers such as npm. What it offers ranges from
offline installs on account of caching, to increased reliability
and security, arising from paradigms within the installation
that offer network resilience and concurrency for performance.
Yarn goes the distance for developers looking for more
efficiency and fewer failures across systems resulting from
mismanagement of dependencies.


8. Apache Software: Multiple projects by the
Apache Software Foundation
From releases of Hadoop to Spark, the Apache Software
Foundation has had a number of important releases for
projects in the open source domain this year.
CarbonData is a novel Big Data file format for interactive
querying using advanced columnar storage. TinkerPop is
another project that works behind the scenes in powering
graph-style modelling, formulation and analysis for a number
of popular frameworks including the likes of Spark, Titan and
Neo4j. Kudu, an integral component of most enterprise stacks
already, is poised to redefine the way we approach data
storage and analysis using Hadoop and HBase, providing an
optimal solution for large amounts of frequently updated data.

MXNet was one of Amazon’s widely discussed deep
learning frameworks for cross-platform applications on
the Web and the mobile, that was accepted by the Apache
Incubator and saw exciting progress through 2017.
Some other projects with major updates released in 2017
by the Apache Software Foundation include Zeppelin,
Solr, Arrow, Spark and Kafka.


9. CockroachDB: A massively scalable,
fault-tolerant SQL database
As the name states, the inspiration for this project is from
cockroaches, the only living creatures that scientists
believe will survive a potential nuclear war or ice age. The
database is modelled similarly, offering an SQL interface
that can scale massively and reliably, and that focuses on
resilience instead of recovery. CockroachDB is serving global
cloud needs for organisations like Baidu.


10. Kubernetes: Container orchestration
and management at scale
For software that was adopted by a mere 10 per cent of
the industry in 2015, to rise to a phenomenal adoption rate
of nearly 71 per cent of the market by 2017 (according to
the same service provider) is an achievement. Originally
a Google-backed project and now managed by the Cloud
Native Computing Foundation, Kubernetes is only going to
see good times as cloud computing and containers take the
tech industry by storm.


11. XGBoost and CatBoost: Gradient boosting
libraries for machine learning
The advent of machine learning in 2017 saw the focus
shift to the practice of gradient boosting, often touted as
‘steroids for learning algorithms’. While XGBoost works
on multi-language and multi-platform gradient boosting,
CatBoost offers categorical feature support for gradient
boosting on decision trees.


12. Microsoft .NET Core 2.0: A minimal
subset of the .NET framework
Placed at the forefront of Microsoft’s foray into open source,
.NET Core 2.0 offers improved functionality to develop cross-
platform applications for the Universal Windows Platform as
well as Xamarin. It will be interesting to see how adoption
rates shape up for this project in the days to come.

Figure 2: The top open source projects in 2017


By: Swapneel Mehta

The author has worked with Microsoft Research, CERN and
startups in AI and cyber security. An open source enthusiast,
he enjoys spending his time organising software development
workshops for school and college students. You can connect
with him at https://www.linkedin.com/in/swapneelm and find
out more at https://github.com/SwapneelM.


Most of us use cloud services even though we are unaware that we are doing so. Many
social media platforms, if not all, are cloud based. A PaaS (Platform-as-a-Service) offering 
is a third party model in which the service provider offers software and hardware tools, 
usually for application development. This article discusses a selection of cloud platforms. 


In the early days, only geeky end users played around with
open source tools and technologies. Today, open source 
solutions are becoming the de facto standard for many large 
enterprises. Among the many open source tools, the cloud 
is one of the hot favourites, as it comes with many cost and 
operational (capex and opex) advantages. 

Many cloud based platforms evolved during 2009-2010 
and started incorporating many features that were offered ‘as 
a service’. Along with public and private cloud platforms, 
the hybrid cloud (a combination of the public and private 
cloud) offering has become popular, as it caters both to 
the privacy needs (the private cloud) of the organisation 
as well as the scalability options available when hosting 
in the public cloud. 

In this article, I will discuss six popular cloud based open 
source platforms and frameworks. 


WSO2: A PaaS framework maintained by Apache
In June 2013, WSO2 donated the PaaS framework, Stratos, to 
Apache. The main intention and goal was to foster a vendor- 
neutral open cloud community that supports a plethora of 
cloud environments and protects against cloud lock-in. 

From May 2014 onwards, this was heavily supported by
the likes of SUSE, Cisco, Citrix, NASA, Sungard, etc. WSO2
now provides WSO2 Private PaaS 4.0, the first comprehensive, 
enterprise-grade Platform-as-a-Service based on the Apache 
Stratos 4.0 PaaS framework, which was released in 2014. 

The framework supports the growth of cloud-enabled 
connected businesses that demand a PaaS capable of handling 
the disparate applications and tools used by companies, their 
customers and partners. The announcement was made in 
conjunction with WSO2Con Europe 2014. 

WSO2 in a nutshell: This is a Platform-as-a-Service
(PaaS) framework from the Apache open source community.
It provides:

= Elastic scalability for any type of service,
using the underlying infra cloud

= Managing, logging and metering of supported services

= Foundation services for:
  - User management
  - Storage and billing

WSO2 architecture explained: WSO2 or Stratos 
architecture has multiple components to it. As mentioned 
earlier, along with logging and metering, it has many 
services components that contain load balancers (which 
are nothing but service nodes).

Figure 1: WSO2 architecture (Source: https://wso2.com/platform)


These service node components are connected to the 
‘real-time event bus’ and to jClouds. The latter supports many 
cloud platforms like OpenStack, vCloud, CloudStack and 
EC2. There are also components like Artifact for distribution 
co-ordination, an auto scaler based on rules, a CEP (complex 
event processor) and cloud controllers. 
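
The idea of an auto scaler 'based on rules' can be sketched in a few lines. The code below is a generic illustration under our own assumptions and is not taken from Stratos or WSO2: the ScalingRule fields, the thresholds and the one-node-at-a-time policy are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ScalingRule:
    """Hypothetical rule: thresholds on average load (0.0 to 1.0) plus node limits."""
    scale_out_above: float
    scale_in_below: float
    min_nodes: int
    max_nodes: int


def desired_nodes(current_nodes, average_load, rule):
    """Return the node count the rule asks for, changing by at most one node."""
    if average_load > rule.scale_out_above:
        target = current_nodes + 1
    elif average_load < rule.scale_in_below:
        target = current_nodes - 1
    else:
        target = current_nodes
    # Never go outside the configured limits.
    return max(rule.min_nodes, min(rule.max_nodes, target))


if __name__ == "__main__":
    rule = ScalingRule(scale_out_above=0.8, scale_in_below=0.3, min_nodes=1, max_nodes=5)
    for load in (0.9, 0.5, 0.1):
        print(load, desired_nodes(2, load, rule))
```

A production auto scaler would evaluate such rules against events arriving on the event bus and then ask the cloud controller to start or stop service nodes.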

If you observe closely, WSO2 is nothing but a microservices
based architecture that supports various services.
Interaction between inner architecture and outer architecture 
components is made easy with messaging channels, cluster 
based micro services and the message bus. 

WSO2 has the advantage of fostering agility and 
flexibility with open source. This helps in recomposing 
code, or making changes and additions to meet specific 
requirements as and when they evolve. Integration of 
applications is also made easy. 

WSO2 helps in building solutions in the following domains:
= Financial sector (open banking)
= Regulatory compliance (GDPR or General Data
Protection Regulation)

WSO2 features: These are listed below. 

API management: The API manager comprises three
components: API Publisher, API Gateway and
API Subscriber. The WSO2 API Manager is a 100 per
cent open source enterprise-class solution that supports
API publishing, life cycle management, application
development, access control, rate limiting and analytics
in one cleanly integrated system. This allows users
to design, compose and publish APIs and, hence,
API discovery is made easy.
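
Rate limiting, one of the capabilities listed above, is commonly implemented with a 'token bucket'. The sketch below shows that general technique only; it is not WSO2's implementation, and the class name, parameters and injectable clock are our own choices for illustration.

```python
import time


class TokenBucket:
    """Allow short bursts up to `capacity`, refilled at `rate_per_second`."""

    def __init__(self, rate_per_second, capacity, clock=time.monotonic):
        self.rate = rate_per_second
        self.capacity = capacity
        self.clock = clock  # injectable so the behaviour can be tested
        self.tokens = float(capacity)
        self.last_refill = clock()

    def allow(self):
        """Return True if a request may proceed, consuming one token."""
        now = self.clock()
        elapsed = now - self.last_refill
        # Add the tokens earned since the last call, up to the bucket size.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last_refill = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


if __name__ == "__main__":
    bucket = TokenBucket(rate_per_second=5, capacity=10)
    allowed = sum(bucket.allow() for _ in range(20))
    print(f"{allowed} of 20 back-to-back requests allowed")
```

An API gateway would keep one such bucket per subscriber or per API key, and reject or delay calls for which allow() returns False.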

Integration: This brings digital transformation into a 
single package for connecting enterprise systems and data. 
This feature helps in: 
= Optimising business processes 
= Reducing costs and potential bottlenecks 
= Leveraging the technology and cost savings of the cloud 
= Integrating legacy systems instead of total replacements 
= Building businesses that can adapt at speed
to changing conditions


IAM (identity and access management): This feature 
helps in securing digital business by connecting and managing 
multiple identities. IAM can be integrated with Azure or
Salesforce based cloud solutions. IAM bridges multiple single
sign-on (SSO) protocols such as OpenID Connect, SAML 
2.0 and WS-Federation to provide a unified SSO experience. 
It provides strong authentication and enforces multi-factor 
authentication with SMS/email one-time passwords (OTPs), 
Fast Identity Online (FIDO), MePIN, Duo Security and more. 
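
To show how a one-time password can be derived, the sketch below implements the standard HMAC-based OTP algorithm (HOTP, RFC 4226) using only Python's standard library. This is a generic illustration: the text does not say which algorithm WSO2 uses for its SMS/email OTPs, and the shared secret shown is the RFC's published test value, not a real credential.

```python
import hashlib
import hmac
import struct


def hotp(secret, counter, digits=6):
    """Return the HOTP code for a shared secret (bytes) and a moving counter."""
    # HMAC-SHA1 over the counter encoded as an 8-byte big-endian integer.
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    # Dynamic truncation: the low 4 bits of the last byte pick a 4-byte window.
    offset = mac[-1] & 0x0F
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


if __name__ == "__main__":
    # Test secret published in RFC 4226; never use it in production.
    shared_secret = b"12345678901234567890"
    for counter in range(3):
        print(counter, hotp(shared_secret, counter))
```

The server and the user's device both hold the secret and the counter, so each side can compute the same short code without ever sending the secret over the network.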

The advantages of this feature are: 
= Allows easy access from anywhere 
= Connects everyone to everything 
= Improves productivity 
= Enhances the user experience 
= Single sign-on and identity federation
= Identity based governance and administration

Analytics: You can leverage streaming analytics to gain 
the business intelligence you need for digital transformation. 
WSO2 Data Analytics Server (WSO2 DAS) is a powerful 
open source analytics platform that analyses data streams in 
real-time. It offers streaming analytics capabilities, complex 
event processing and machine learning to help you understand 
events, map their impacts, identify patterns and react within 
milliseconds, in real-time. 

The advantages are that you can:

= Monitor and analyse with operational analytics

= Create better customer experiences 

= Make informed business decisions 

= Extend the solutions to analyse the past, present and future 

Why WSO2 is enterprise grade: WSO2 provides
many enterprise grade advantages and features that no other 
platform or framework does. These help in easy deployment 
and integration. 

The advantages of WSO2 are:

= Centralised metering and monitoring with a unified
logging framework

= Ability to plug in any third party health checking/
monitoring framework

= Cartridge model enables bringing in even legacy apps into
the cloud as service nodes

= Supports DevOps tooling

= Elastic scaling (not only HTTP based services)

= Cloud bursting (scales across multiple infra clouds
simultaneously)

= Multi-zone/data centre support

= Multiple tenant isolation levels (VM, LXC, Docker)


HPE Helion Stackato, a hybrid cloud solution 
Stackato (which was a Cloud Foundry and Docker based PaaS) 
was acquired by HPE around July 2015. HPE Helion Stackato 
is a cloud native platform that provides enterprises the right 
services, tools and control to enable developers to accelerate 
their innovations. HPE Helion Stackato is a polyglot Platform-
as-a-Service (PaaS) and also supports SaaS. This can be used
to deploy applications written in a wide range of languages and
Web frameworks, using a variety of data services. 

Helion Stackato helps in the automatic configuration of 
language runtime, Web servers, application dependencies, 
databases and other services. Stackato can be run in any 
data centre using the hypervisor of choice or by using a 
supported cloud hosting provider. It provides IT operators 
with simplified deployment and cloud native services. It is 
infrastructure agnostic, guaranteeing compatibility across cloud 
infrastructures. With the Helion Service Manager, operators can 
easily manage application services while leveraging the Helion 
Control Plane to oversee the entire application life cycle.

HPE Helion Stackato provides multi-cloud integration 
support for enterprise level developers by using the Web 
console. Using this, developers can create applications from 
any code repos (e.g., GitHub) or any of their enterprise 
grade repos, enable the full build, and then test and deploy 
the complete pipeline. The management of the application’s 
life cycle has thus been made very easy. As mentioned 
earlier, any of the popular development languages like Java, 
.NET, Python, Go, Ruby and Node.js can be chosen. Also, 
developers can build and use plugins for the existing popular 
IDEs like MS Visual Studio, Eclipse, etc.

HPE Helion Stackato platform services: These are 
listed below. 

Helion Control Plane (HCP): This is a core platform service 
that HPE Helion Stackato uses to manage service life cycles and 
communicate with the underlying cloud provider (IaaS) layer. 

Helion Service Manager (HSM): This provides a repository 
of services that can be used by applications. 

Helion Cloud Foundry (HCF): This is a Cloud Foundry 
certified elastic runtime that simplifies cloud native 
application development and hosting. 

Helion Code Engine (HCE): This is a continuous delivery 
service that integrates with public or private Git repositories. 
HCE is a flexible and extensible continuous integration/
continuous delivery (CI/CD) pipeline.

HPE Helion Stackato Console (HSC): This is a Web 
interface that’s used to manage HCF and HCE features. 

Figure 2: HPE Helion Stackato architecture (Source: https://docs.hpcloud.com/stackato/stackato/planning/architecture.html)

The advantages of using Stackato are:

= Stackato provides an agile, robust cloud application platform

= It automates the auto scaling of the Stackato cluster using
the supported cloud dashboard

= Working seamlessly together, Stackato and the supported
cloud enable you to expand and reduce cloud resources
based on user demand, thus reducing the setup time

= Stackato helps in improving the productivity of cloud
administrators and developers

Stackato is well suited for enterprise cloud application
deployments, and also helps in the application model when
compared to IaaS or cloud orchestration software. It combines
the flexibility offered by direct VM access on IaaS with the
highly automated configuration provided by PaaS. Computing
resources are shared efficiently and securely by giving each
application its own Linux container (using LXC), which can
be extensively customised to suit the application it is hosting.
Stackato supports the following:
= Citrix XenServer
= OpenStack
= KVM
= VMware
= AWS
= HP Cloud services
= Dell


Cloudify


Cloudify was developed and designed on the principles of 
openness to power the IT transformation revolution. It enables 
organisations to design, build and deliver various business 
applications and network services. The latest version of 
Cloudify is 4.2 and it incorporates enhanced features like 
advanced security, control and true self-service. 

Cloudify has very good orchestration support for NFV 
(network function virtualisation). Its TOSCA (Topology
and Orchestration Specification for Cloud Applications) 
based, open and pluggable architecture provides end-to-end 
management and orchestration (MANO) of the NFV lifecycle. 

This enables telecom based organisations and operators 
to build best-of-breed NFV stacks and reduce the overhead 
costs considerably. 

Cloudify 4.2 introduced a totally new concept for 
container orchestration with Kubernetes (Kubernetes Cloud 
Native Orchestration). From now on, Cloudify will offer a
Kubernetes Provider. This will allow you to utilise Cloudify
as a cloud provider, so you can easily deploy a cluster as
well as scale and auto scale the number of nodes natively, 
configure networking and load balancing, and have storage 
and compute customisation as well as native multi-cloud 
support, as and when required. 

Figure 3 shows the detailed architecture of Cloudify.

Figure 3: Cloudify architecture (Source: http://cloudify.co/guide/3.0/overview-architecture.html)

Cloudify also enables ARIA TOSCA integration. In
order to meet the complete standards requirements, TOSCA
Simple Profile 1.0 is now supported using the ARIA plugin.
The plugin allows orchestrating TOSCA CSAR packages
by introducing a new ARIA node type for Cloudify, which
exposes Project ARIA’s capabilities to Cloudify.


Note: ARIA (Agile Reference Implementation
of Automation) is a vendor neutral and technology
independent implementation of the OASIS TOSCA
specification. ARIA offers a command line interface (CLI)
to develop and execute TOSCA templates, and an easily
consumable software development kit (SDK) for building
TOSCA enabled software.


Cloudify Composer and Cloudify UI are the two main 
frameworks that support integration and unified login; both 
are interdependent and allow many enhancements. 

Cloudify’s well-known features are: 
* It supports cloud portability and frees businesses from
  vendor lock-in, ensuring more flexibility
* It provides a native cloud experience
* It helps in faster app rollout and life cycle automation,
  shortening the deployment time from days to minutes
* It helps to maximise cloud performance with streamlined
  processes, enhancing manageability and minimising errors
* It offers complete control, visibility and app-centric
  activity monitoring
* It is very cost-efficient and helps to optimise cloud usage
* It helps in application modelling (describing an application
  with all its resources)

Apart from the above list, orchestration in Cloudify 
enables maintaining and running identified applications. 
Pluggability, which is one of the core, unique features of 
Cloudify, provides reusable components (SDN components, 
NFV components, and so on) as well as abstraction for the 
system. And Cloudify’s security features control who has 
permissions to use it to execute various operations. 


Cloud Foundry 

Cloud Foundry is an open Platform-as-a-Service (PaaS)
which provides a choice of clouds, developer frameworks and
application services. Cloud Foundry makes it faster and easier
to build, test, deploy and scale applications.
Its features are: 
* Application- and services-centric life cycle API
* High performance and dynamic routing
* Data and Web services brokers for cloud brokering
* Linux container management (LXC)
* Features like role based access control (RBAC) and teams,
  which work well with standards based user authentication
  and authorisation
* Active application health monitoring and management
* Integrated real-time logging API
* Multi-provider ecosystem


OpenShift 

OpenShift is Red Hat’s cloud computing PaaS offering. It is
an application platform in the cloud where app developers and
teams can build, test, deploy and run their applications.
Its features are: 

* Built-in support for Node.js, Ruby, Python, PHP, Perl and
  Java (the standard in today’s enterprise world)
* OpenShift provides customisable cartridge functionality
  for extensions that allow developers to add any other
  language they wish
* It supports frameworks ranging from Spring
  and Rails, to Play, etc
* Auto scaling is one of the prominent features of
  OpenShift; this helps in scaling of applications by adding
  additional instances
* OpenShift by Red Hat is built on open source
  technologies (Red Hat Enterprise Linux, or RHEL)
* It provides one-click deployment


Tsuru: a PaaS from Globo.com 

Tsuru is an open source PaaS that was started around 
January 2012. It supports Mongo, MySQL, Elastic Search, 
Varnish, Redis, Memcached, Cassandra, etc. Tsuru also 
supports Go, PHP, Static, Node.js, Java, Python and Ruby 
platforms for development. 

Tsuru can be easily and directly deployed from Git 
repositories. Scaling up of applications using Tsuru is easy, 
while its integration with Docker and Kubernetes is also 
picking up very fast. 

Its features are: 
* Simple architecture
* Resilience
* Easy customisation
* Completely open source
* Zero downtime


Top 10 Open Source Tools for Linux Systems Administrators


Linux systems administrators need a number of tools to keep their systems well-oiled 
and running smoothly at peak efficiency levels. Here is a set of ten tools which will help 
them, whether they are newbies or veterans getting a refresher course. 


A systems administrator’s job is complex, covering
tasks that range from managing systems
to intrusion detection. Thankfully, the world of
open source software provides a comprehensive set of tools
to simplify admin tasks. The following list of ten key open
source tools covers all the bases.


1. Cockpit 

Cockpit is software developed by Red Hat that provides
an interactive browser based Linux administration
interface. Its graphical interface allows beginner system
administrators to perform common sysadmin tasks
without the requisite skills on the command line. In
addition to making systems easier to manage for novice
administrators, Cockpit also makes systems configuration
and performance data accessible to them, even if they
do not know command line tools. It is available via the
cockpit package in the Red Hat Enterprise Linux 7 Extras
repository. You can install Cockpit using the following command:


# yum -y install cockpit


Once installed on a system, Cockpit must be started
before it can be accessed across the network, as shown below.
Cockpit can be accessed remotely via HTTPS using a Web
browser and by connecting to TCP Port 9090. This port must
be opened on the system firewall for Cockpit to be accessed
remotely. It is defined as the cockpit service for firewalld.


# systemctl start cockpit 
# firewall-cmd --add-service=cockpit --permanent 
# firewall-cmd --reload

Figure 1: Package installation and starting the service


Once the connection is established with the Cockpit 
Web interface, a user must be authenticated in order to 
gain entry. Authentication is performed using the system’s 
local OS account database. The dashboard screen in the 
Cockpit interface provides an overview of the core system’s 
performance metrics. Metrics are reported on a per second 
basis, and allow the administrator to monitor the use of 
subsystems such as the CPU, memory, network and disk. 


2. PCP 

Red Hat Enterprise Linux 7 includes a program called
Performance Co-Pilot, or PCP, which allows the administrator
to collect and query data from various subsystems. It is
provided by the pcp RPM package. After installation, the
machine will have the pmcd daemon, which is necessary for
collecting the subsystem data, as well as various command line
tools for querying system performance data.

There are several services that are part of PCP but the
one that collects systems performance data locally is pmcd,
or the Performance Metrics Collector Daemon. This service
must be running in order to query performance data with
the PCP command line utilities. The pcp package provides a
variety of command line utilities to gather and display data
on the machine.

The pmstat command provides information similar to 
vmstat. pmstat supports options to adjust the interval between 
collections (-t) or the number of samples (-s). 
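As a rough illustration of the two options just mentioned, the sketch below asks pmstat for five samples taken two seconds apart. The values 2 and 5 are arbitrary choices for this example, and the guard is only there to avoid an error on machines where the pcp package is not installed or pmcd is not running.

```shell
# Sample interval in seconds and number of samples (example values only).
interval=2
samples=5

if command -v pmstat >/dev/null 2>&1; then
    # -t sets the interval between collections, -s the number of samples.
    pmstat -t "$interval" -s "$samples" || echo "pmstat failed; is pmcd running?"
else
    echo "pmstat not found; install the pcp package first"
fi
```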

The pmatop command provides a top-like output
of machine statistics and data. It includes disk I/O and
network I/O statistics, as well as the CPU, memory and process
information provided by other tools. By default, pmatop will
update every 5 seconds.

The pmval command displays the values of a single
performance metric. For example, it can be used to obtain
historical statistics of per-CPU idle time at one-minute
intervals from the most recent archive log.


3. Puppet 

Figure 2: Output of pmstat and pmval commands

Puppet allows the systems administrator to write
infrastructure as code using a declarative language to
configure machines, instead of using individualised and
customised scripts to do so. Puppet’s domain-specific
language is used to describe the state of a machine, and
Puppet can enforce this state. This means that if the
administrator mistakenly changes something on the machine,
Puppet can return the machine to the desired state. Thus,
not only can the Puppet code be used to configure a system
initially, but it can also be used to keep the state of the
system in line with the desired configuration.
Puppet architecture: Puppet uses a server/client model. 
The server is called a Puppet master and it stores recipes and 
manifests for the clients. The clients are called Puppet nodes 
and run the Puppet agent software. These nodes normally run 
a Puppet daemon that is used to connect to the Puppet master. 
The nodes will download the recipe assigned to the node from 
the Puppet master and apply the configuration if needed. 
Configuring a Puppet client: Although Puppet can 
run in standalone mode, where all Puppet clients have 
Puppet modules locally that are applied to the system, most 
systems administrators find that this tool works best using a 
centralised Puppet master. The first step in deploying a Puppet 
client is to install the Puppet package. 


# yum -y install puppet 


Once the Puppet package is installed, the Puppet client 
must be configured with the host name of the Puppet master. 
The host name of the Puppet master should be placed in the 
/etc/puppet/puppet.conf file under the [agent] section as 
shown below. First open the configuration file of Puppet, and 
then make entries as shown, before saving the file. 


# vim /etc/puppet/puppet.conf
[agent]
server = puppet.demo.example.com


The final step to be taken on the Puppet client is to start 
the Puppet agent service and configure it to run at boot time. 


# systemctl start puppet.service
# systemctl enable puppet.service


4. AIDE 


System stability is put at risk when configuration files 
are deleted or modified without authorisation or careful 
supervision. How can a change to an important file or 
directory be detected? This problem can be solved by using 
intrusion detection software to monitor files for changes. 
Advanced Intrusion Detection Environment, or AIDE, can be
configured to monitor files for a variety of changes including
permissions or ownership changes, timestamp changes, or
content changes.

To get started with AIDE, install the RPM package 
that provides the AIDE software. This package has useful 
documentation on tuning the software to monitor the specific 
changes of interest. 


# yum -y install aide


Once the software is installed, it needs to be configured. 
The /etc/aide.conf file is the primary configuration file 
for AIDE. It has three types of configuration directives: 
configuration lines, selection lines and macro lines. 

Configuration lines take the form param=value. When 
param is not a built-in AIDE setting, it is a group definition 
that lists which changes to look for. For example, the 
following group definition can be found in /etc/aide.conf, 
which is installed by default: 


PERMS = p+i+u+g+acl+selinux


The aforesaid line defines a group called PERMS that 
looks for changes in file permissions (p), inode (i), user 
ownership (u), group ownership (g), ACLs (acl), or SELinux 
context (selinux). We can write our own parameters in the file 
which can be used to monitor our systems. 

Selection lines define which checks are performed on
matched directories. Each one pairs a path with a group
such as PERMS.

Figure 3: The configuration file of AIDE, /etc/aide.conf

Figure 4: CPUs running in a machine can be identified by the lscpu command

Figure 5: The dmidecode command in use


The third type of directive is macro lines — these define 
variables and their definition has the following syntax. 


@@define VAR value 
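To show how the three directive types fit together, here is a small hypothetical fragment that could be appended to /etc/aide.conf. The WEBROOT variable, the WEBCHECK group and the paths are invented for this sketch; only PERMS comes from the default file described above, so adapt the names to your own system.

```
# Macro line: define a variable that later lines can reuse (hypothetical name)
@@define WEBROOT /var/www

# Configuration line: a custom group checking permissions, owner,
# group and a SHA-256 checksum of the content (hypothetical name)
WEBCHECK = p+u+g+sha256

# Selection lines: path followed by the group of checks to apply
/etc PERMS
@@{WEBROOT} WEBCHECK

# A leading ! excludes a path from monitoring
!/etc/mtab
```

After changing the configuration, the database has to be initialised again so that later checks compare against the new rules.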


Execute the aide --init command to initialise the AIDE
database. On RHEL 7, the new database is written to
/var/lib/aide/aide.db.new.gz and must be renamed to
/var/lib/aide/aide.db.gz before it can be used. After the
database is in place, a file system check can be performed
using the aide --check command. This command scans the file
system and compares the current state of files with the
information in the earlier AIDE database. Any differences that
are found will be displayed, showing both the original file’s
state and the new condition.


5. Mcelog 

An important step in troubleshooting potential hardware 
issues is knowing exactly which hardware is present in the 
system. The CPU(s) in a running system can be identified
from the lscpu command, as shown in Figure 4.

The dmidecode tool can be used to retrieve information 
about physical memory banks, including the type, speed and 
location of the bank as shown in Figure 5. 

Modern systems can typically keep a watch on various
hardware failures, alerting an administrator when a hardware
fault occurs. While some of these solutions are vendor-specific
and require a remote management card, others can be read
from the OS in a standard fashion. RHEL 7 provides mcelog
for logging hardware faults, which provides a framework
for catching and logging machine check exceptions on x86
systems. On supported systems, it can also automatically mark
bad areas of RAM so that they will not be used.

www.OpenSourceForU.com | OPEN SOURCE FOR YOU | FEBRUARY 2018 | 55

Install and enable mcelog as shown below: 


# yum -y install mcelog
# systemctl enable mcelog
# systemctl start mcelog


From now on, hardware errors caught by the mcelog
daemon will show up in the system journal. Messages can
be queried using the journalctl -u mcelog.service command.
If the abrt daemon is installed and active, it will also
trigger on various mcelog messages. Alternatively, for
administrators who do not wish to run a separate service, a
cron job is set up, but commented out, in
/etc/cron.hourly/mcelog.cron; once enabled, it will dump
events into /var/log/mcelog.


6. Memtest86+


When a physical memory error is suspected, an administrator
might want to run an exhaustive memory test. In such
cases, the memtest86+ package must be installed. Since
running a memory test on a live system is less than ideal, the
memtest86+ package will install a separate boot entry that
runs Memtest86+ instead of the regular Linux kernel. The
following steps outline how to enable this boot entry:
1.) Install the memtest86+ package. This will install the
Memtest86+ application into /boot.
2.) Run the command memtest-setup. This will add a new
template into /etc/grub.d to enable Memtest86+.
3.) Update the Grub2 boot loader configuration as shown below:


# grub2-mkconfig -o /boot/grub2/grub.cfg


7. Nmap 

Nmap is an open source port scanner that is provided by
the Red Hat Enterprise Linux 7 distribution. It is a tool that
administrators use to rapidly scan large networks but it can
also do a more intensive port scan on individual hosts. Nmap
uses raw IP packets in novel ways to determine what
hosts are available on the network, what services those hosts
are offering, what OS they are running, what type of packet
filters/firewalls are in use and dozens of other characteristics.
The nmap package provides the nmap executable.

Figure 6: Nmap scanning ports of some other machine


# yum -y install nmap 


The example given in Figure 6 shows Nmap scanning the
network. The -n option instructs it to display host information
numerically, not using DNS. As Nmap discovers each host,
it scans privileged TCP ports looking for services. It displays
the MAC address, with the corresponding network adapter
manufacturer of each host.


8. Wireshark 


Wireshark is an open source, graphical application for 
capturing, filtering and inspecting network packets. It was 
formerly called Ethereal, but because of trademark issues, 
the project’s name was changed. Wireshark can perform 
promiscuous packet sniffing when network interface 
controllers support it. RHEL 7 includes the wireshark-gnome 
package. This provides Wireshark functionality on a system 
installed with X. 


# yum -y install wireshark-gnome


Once Wireshark is installed, it can be launched by 
selecting Applications > Internet > Wireshark Network 
Analyzer from the GNOME desktop. It can also be launched 
directly from the shell using the following command: 


# wireshark 


Wireshark can capture network packets. It must be
executed by the root user to capture packets, because direct
access to the network interface requires root privileges. The
Capture option in the top level menu, as shown in Figure 7,
permits the user to start and stop the capture of packets.
It also allows the administrator to select the interface
to capture packets on. The ‘any’ option in the interface list
matches all of the network interfaces.

Figure 7: Graphical user interface of Wireshark

Once packet capturing has been stopped, the captured
network packets can be written to a file for sharing or later
analysis. The File > Save... or File > Save As... menu item
allows the user to specify the file to save the packets into.
Wireshark supports a variety of file formats.


9. Kdump 

RHEL offers the Kdump software for the capture of kernel crash
dumps. This software works by using the kexec utility on a 
running system to boot a secondary Linux kernel without going 
through a system reset. The Kdump software is installed by default 
in RHEL 7 through the installation of the kexec-tools package. The 
package provides the files and command line utilities necessary for 
administering Kdump from the command line. 


# yum -y install kexec-tools system-config-kdump


The Kdump crash dump mechanism is provided through 
the kdump service. Administrators interested in enabling the 
collection of kernel crash dumps on their systems must ensure 
that the kdump service is enabled and started on each system. 


# systemctl enable kdump
# systemctl start kdump


With the kdump service enabled and started, kernel 
crash dumps will begin to be generated during system hangs 
and crashes. The behaviour of the kernel crash dump and 
collection can be modified in various ways by using the /etc/ 
kdump.conf configuration file. 
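For orientation, the two directives below are the kind of settings found in /etc/kdump.conf on a stock RHEL 7 install. Treat the exact values as an assumption and compare them with the file and the kdump.conf man page on your own system before changing anything.

```
# Directory on the local file system under which crash dumps are written
path /var/crash

# Program used to copy the dump: -l compresses with LZO and -d 31 leaves
# out zero, cache, user and free pages to keep the dump small
core_collector makedumpfile -l --message-level 1 -d 31
```

Changes to this file take effect after the kdump service is restarted.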

By default, Kdump captures crash dumps locally to crash
dump files located in subdirectories under the /var/crash path.


10. SystemTap 
The SystemTap framework allows easy probing and 
instrumentation of almost any component within the kernel. 
It provides administrators with a flexible scripting language 
and library by leveraging the kprobes facility within the 
Linux kernel. Using kprobes, kernel programmers can attach 
instrumentation code to the beginning or end of any kernel 
function. SystemTap scripts specify where to attach probes 
and what data to collect when the probe executes. 

SystemTap requires symbolic naming for instructions within
the kernel, so it depends on the following packages, which are
not usually found on production systems. These packages will
pull in any required dependencies. The packages are:
1) kernel-debuginfo
2) kernel-devel
3) systemtap

Using stap to run SystemTap scripts: The SystemTap 
package provides a variety of sample scripts that administrators 
may find useful for gathering data on their systems. The scripts 
are stored in /usr/share/doc/systemtap-client-*/examples. These 
scripts are further divided into several different subdirectories 
based on what type of information they have been asked to 
collect. SystemTap scripts have an extension of .stp. 

To compile and run these example scripts, or any other
SystemTap script for that matter, administrators use the
stap command.
Sets of huge volumes of complex data that cannot
be processed using traditional data processing
software are termed Big Data. The functions
of Big Data include privacy, data storage, capturing
data, data analysis, searching, sharing, visualisation,
querying, updating, transfers, and information security.
There are many Big Data techniques that can be
used to store data, to perform tasks faster, to make the
system parallel, to increase the speed of processing
and to analyse the data. There are also a number of
distributed computation systems that can process Big
Data in real-time or near real-time.
A brief description of the five best Apache Big Data
frameworks follows.


Apache Hadoop 

Apache Hadoop is an open source, scalable and fault
tolerant framework written in Java. It is a processing
framework that exclusively provides batch processing,
and efficiently processes large volumes of data on a
cluster of commodity hardware. Hadoop is not only a
storage system but also a platform for processing the large
volumes of data it stores.

Modern versions of Hadoop are composed of
several components or layers that work together to
process batch data. These are listed below.

* HDFS (Hadoop Distributed File System): This is
  the distributed file system layer that coordinates
  storage and replication across the cluster nodes.
  HDFS ensures that data remains available in spite
  of inevitable host failures. It is used as the source of
  data, to store intermediate processing results, and to
  persist the final calculated results.
* YARN: This stands for Yet Another Resource
  Negotiator. It is the cluster coordinating
  component of the Hadoop stack, and is responsible
  for coordinating and managing the underlying
  resources and scheduling jobs that need to be run.
  YARN makes it possible to run many more diverse
  workloads on a Hadoop cluster than was possible
  in earlier iterations by acting as an interface to the
  cluster resources.
* MapReduce: This is Hadoop’s native batch
  processing engine.


Apache Storm 

Apache Storm is a stream processing framework that 
focuses on extremely low latency and is perhaps the 
best option for workloads that require near real-time 
processing. It can handle very large quantities of 
data and deliver results with less latency than other 
solutions. Storm is simple, can be used with any 
programming language, and is also a lot of fun. 


Storm has many use cases: real-time analytics, online 
machine learning, continuous computation, distributed 
RPC, ETL, and more. It is fast—a benchmark clocked it at 
over a million tuples processed per second per node. It is 
also scalable, fault-tolerant, guarantees your data will be 
processed, and is easy to set up and operate. 


Apache Samza 

Apache Samza is a stream processing framework that is 
tightly tied to the Apache Kafka messaging system. While 
Kafka can be used by many stream processing systems, 
Samza is designed specifically to take advantage of Kafka’s 
unique architecture and guarantees. It uses Kafka to provide 
fault tolerance, buffering and state storage. 

Samza uses YARN for resource negotiation. This means 
that, by default, a Hadoop cluster is required (at least HDFS 
and YARN). It also means that Samza can rely on the rich 
features built into YARN. 


Apache Spark 

Apache Spark is a general purpose and lightning fast cluster
computing system. It provides high-level APIs in Java, Scala,
Python and R. It can run workloads up to 100 times faster than
Hadoop MapReduce when the data is held in memory, and up to
ten times faster when it is read from disk. It can be integrated
with Hadoop and can process existing Hadoop HDFS data.

Apache Spark is a next generation batch processing 
framework with stream processing capabilities. Built using 
many of the same principles of Hadoop’s MapReduce engine, 
Spark focuses primarily on speeding up batch processing 
workloads by offering full in-memory computation and 
processing optimisation. 
Figure 1: Big Data frameworks

Table 1: A comparison of the best Big Data frameworks


Frameworks compared: Apache Hadoop, Apache Spark, Apache Flink,
Apache Storm and Apache Samza.

1. Data processing
   Hadoop: It is a batch processing system.
   Spark: It is a batch processing and stream processing system.
   Flink: It is a batch processing as well as stream processing
   system.
   Storm: It supports stream processing.
   Samza: It supports stream processing.

2. Streaming engine
   Hadoop: MapReduce is a batch-oriented processing tool. It takes
   large data sets as input, processes them and produces the
   results.
   Spark: Apache Spark streaming processes data streams in
   micro-batches.
   Flink: Apache Flink is the true streaming engine. It uses
   streams for workloads: SQL, micro-batch, and batch.
   Storm: Storm processes events one by one.
   Samza: Samza processes events one by one.

3. Data flow
   Hadoop: MapReduce computation data flow does not have any
   loops. It is a chain of stages.
   Spark: Spark represents it as a directed acyclic graph (DAG).
   Flink: It supports controlled cyclic dependency graphs in
   runtime.
   Storm: Storm’s topology is designed as a directed acyclic graph
   (DAG) with spouts, bolts and streams used to process data.
   Samza: Samza relies on Kafka’s semantics to define the way that
   streams are handled.

4. Fault tolerance
   Hadoop: MapReduce is highly fault-tolerant. There is no need to
   restart the application from scratch in case of any failure in
   Hadoop.
   Spark: Apache Spark streaming recovers lost work and, with no
   extra code or configuration, it delivers exactly-once semantics
   out-of-the-box.
   Flink: The fault tolerance is based on Lightweight Distributed
   Snapshots.
   Storm: When workers die, Storm will automatically restart them.
   If a node dies, the worker will be restarted on another node.
   Samza: Whenever a machine in the cluster fails, Samza works
   with YARN to transparently migrate your tasks to another
   machine.

5. Scalability
   Hadoop: MapReduce has incredible scalability potential and has
   been used in production on tens of thousands of nodes.
   Spark: It is highly scalable. We can keep adding ‘n’ number of
   nodes in the cluster. A large known Spark cluster is of 8000
   nodes.
   Flink: Apache Flink is also highly scalable. We can keep adding
   ‘n’ number of nodes in the cluster; a large known Flink cluster
   is of thousands of nodes.
   Storm: Scalable, with parallel calculations that run across a
   cluster of machines.
   Samza: Samza is partitioned and distributed at every level.
   Kafka provides ordered, partitioned, fault tolerant streams.
   YARN provides a distributed environment for Samza.

6. Language support
   Hadoop: It supports Java, C, C++, Ruby, Groovy, Perl and
   Python.
   Spark: It supports Java, Scala, Python and R.
   Flink: It supports Java, Scala, Python and R.
   Storm: Storm is written in Java and Clojure but has good
   support for non-JVM languages.
   Samza: Samza is written in Java and Scala. It also supports JVM
   languages.

7. Latency
   Hadoop: Hadoop has higher latency than both Spark and Flink.
   Spark: It is relatively faster than Hadoop, hence offers lower
   latency than Hadoop.
   Flink: With small efforts in configuration, Apache Flink’s data
   streaming runtime achieves low latency and high throughput.
   Storm: Focuses on extremely low latency.
   Samza: Offers low latency performance.

8. Cost
   Hadoop: MapReduce can typically run on less expensive hardware
   than some alternatives since it does not attempt to store
   everything in memory.
   Spark: As Spark requires a lot of RAM to run in-memory,
   increasing it in the cluster gradually increases its cost.
   Flink: Apache Flink also requires a lot of RAM to run
   in-memory, so the cost of using it will increase gradually.
   Storm: Apache Storm also requires a lot of RAM to run
   in-memory, so the cost of using it will increase gradually.
   Samza: Apache Samza also requires a lot of RAM to run
   in-memory, so the cost of using it will increase gradually.

9. Security
   Hadoop: It supports Kerberos authentication, which is somewhat
   painful to manage. But third party vendors have enabled
   organisations to leverage Active Directory Kerberos and LDAP
   for authentication.
   Spark: Apache Spark’s security is a bit sparse since it
   currently only supports authentication via shared secret
   (password authentication). The security bonus is that if you
   run Spark on HDFS, it can use HDFS ACLs and file-level
   permissions. Additionally, Spark can run on YARN to use
   Kerberos authentication.
   Flink: There is user-authentication support in Flink via the
   Hadoop/Kerberos infrastructure. If you run Flink on YARN, Flink
   acquires the Kerberos tokens of the user that submits programs,
   and authenticates itself at YARN, HDFS and HBase with that.
   Flink’s upcoming connector and streaming programs can
   authenticate themselves as stream brokers via SSL.
   Storm: Storm cluster authenticates users via Kerberos.
   Samza: Samza uses Apache Hadoop YARN to provide security.

10. Scheduler
   Hadoop: The scheduler in Hadoop is a pluggable component. There
   are two schedulers: Fair Scheduler and Capacity Scheduler. To
   schedule complex flows, MapReduce needs an external job
   scheduler like Oozie.
   Spark: Due to in-memory computation, Spark acts as its own flow
   scheduler.
   Flink: Flink can use the YARN scheduler but Flink also has its
   own scheduler.
   Storm: Storm now has four kinds of built-in schedulers: Default
   Scheduler, Isolation Scheduler, Multi-tenant Scheduler and
   Resource Aware Scheduler.
   Samza: Samza uses the YARN scheduler.
Linux Commands for Systems Administrators

Systems administrators need a bag of tricks to ensure that
everything runs smoothly without any hitches. Linux has a
fine set of utilities and commands to assist sysadmins in their
task. Mastering these tools will take the efficiency levels of
Linux admins to a whole new level.


GNU/Linux is one of the most popular operating systems
for servers. Today, most operating systems use
advanced and modern graphical user interfaces (GUIs)
but the CLI (command line interface) is still popular. Using the
CLI/scripts, you can automate complicated tasks and execute
them in a repetitive manner. In this article, we will discuss the
most common CLI utilities. If you are familiar with GNU/Linux
and want to become more productive, then this article is for you.


1) find

Searching for a file in a file system is a very common task and
we have to do it quite often every day. GNU/Linux provides
the find command which searches for files in a directory
hierarchy. Given below is the syntax of the find command:


find [STARTING-POINT] [EXPRESSION]


In the above syntax:

* STARTING-POINT represents the directory’s location,
  from where the search will start. Note that if
  STARTING-POINT is omitted, then the search will begin from
  the current directory.
* EXPRESSION is evaluated to search the file.
  EXPRESSION can be a name, type, size, permission,
  owner, and so on.

Let us search for a file with a given name. Here, we’ll be
using the file name as an EXPRESSION.


$ find src-dir -name hello.txt


In the above example, src-dir is the STARTING-POINT 
and hello.txt is the EXPRESSION. Here the -name option 
indicates that we are searching for the file by name. If you 
want to perform a case-insensitive search operation, then use 
the -iname option instead. 

For the find command, EXPRESSION can be a pattern as
well. Many times we want to search certain types of files like
.txt, .jpg, .mp3 and so on. The example below shows how to
use a pattern in EXPRESSION. The quotes prevent the shell
from expanding the pattern before find sees it:


$ find src-dir -name "*.txt"


The above example will search and list all .txt files 
recursively from src-dir. 

We can use the file type as an EXPRESSION with the find 
command. For instance, use the command below to search 
only directories: 


$ find src-dir -type d 


In the above example:
* The -type option indicates we are performing a search
  based on the file type
* Argument d indicates the directory file type

In addition to the directory, the find command supports the
following file types:


b  block device
c  character device
f  regular file
l  symbolic link
p  named pipe
s  socket


The find command allows you to perform a search based
on the file size. We can provide an EXPRESSION which
will match files that are greater than, less than or equal
to the provided size (for example, -size +10M, -size -10M
or -size 10M). We can also perform a search based
on permissions, for which we have to use the -perm option.
For instance, the command below searches for files with
the 664 permission:


$ find src-dir -type f -perm 664 


The find command allows you to perform some additional 
operations while searching. For instance, it provides the 
-delete option, which will remove a file that matches with 
EXPRESSION. The following example shows the usage of the 
-delete option: 


$ find src-dir -type f -size +2k -delete 


In the above example, all files greater than 2KB in size 
will be deleted. 
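
Since -delete removes files without asking for confirmation, it may be safer to run the same expression with -print first and review the list. The sketch below is only an illustration: it assumes a throwaway directory created with mktemp, and the file names are made up.

```shell
# Work in a scratch directory (an assumption for this example) so that
# nothing important is touched.
workdir=$(mktemp -d)
mkdir "$workdir/src-dir"

# Create one tiny file and one file of 5KB (hypothetical names).
printf 'tiny\n' > "$workdir/src-dir/small.txt"
dd if=/dev/zero of="$workdir/src-dir/big.log" bs=1024 count=5 2>/dev/null

# Step 1: preview what the expression matches.
to_delete=$(find "$workdir/src-dir" -type f -size +2k -print)
printf 'Would delete: %s\n' "$to_delete"

# Step 2: run the same expression with -delete once the list looks right.
find "$workdir/src-dir" -type f -size +2k -delete

# Only the small file should remain.
remaining=$(find "$workdir/src-dir" -type f -print)
printf 'Remaining: %s\n' "$remaining"
```

Keeping the expression identical in both steps is the point of the exercise; only the final action changes.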

We can also execute other commands on each matching file
while performing a search. We can achieve this using the
-exec option.
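
As a minimal sketch of -exec, the commands below run wc -c on every .txt file that find locates. The {} placeholder is replaced by each matching path, and the escaped semicolon marks the end of the command. The scratch directory and file names are assumptions made for this example.

```shell
# Scratch directory with a few sample files (hypothetical names).
workdir=$(mktemp -d)
mkdir -p "$workdir/src-dir/sub"
printf 'hello\n' > "$workdir/src-dir/a.txt"
printf 'hi\n' > "$workdir/src-dir/sub/b.txt"
printf 'skip\n' > "$workdir/src-dir/c.log"

# Run 'wc -c' once per matching file; {} is replaced by the file path.
find "$workdir/src-dir" -type f -name "*.txt" -exec wc -c {} \;

# Ending the command with + instead of \; passes many paths to a single
# invocation. Here cat concatenates all matches and wc counts the bytes.
txt_bytes=$(find "$workdir/src-dir" -type f -name "*.txt" -exec cat {} + | wc -c)
echo "Total bytes in .txt files: $txt_bytes"
```

The + form usually starts far fewer processes than the \; form, which matters when a search matches thousands of files.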


2) diff 
We often need to compare the contents of files. Doing this 
manually is a tiresome and error-prone task. But fortunately, 
GNU/Linux provides a command for this, which will 
compare files, line by line, and report if any differences are 
found. When the diff command is combined with the patch
command, it makes a powerful combination. With patch,
we can apply the changes recorded by diff from one file to the other.
This section describes both these commands. 

First, create two files with the following content: 


# file-1: version1.txt
str1
str2
str3


# file-2: version2.txt
str1
str3

Now compare these files using the -u option, which stands 
for unified diff: 


$ diff -u version1.txt version2.txt
--- version1.txt 2017-12-30 14:06:38.120849370 +0530
+++ version2.txt 2017-12-30 14:06:46.976750148 +0530
@@ -1,3 +1,2 @@
 str1
-str2
 str3


The above output shows that the line ‘str2’ is not present
in the version2.txt file. We can store this diff output in a file
and apply it as a patch. To create a patch file, just redirect the
output to some file as shown below:


$ diff -u version1.txt version2.txt > diff.patch


If we apply this patch to the version1.txt file, then it will 
remove the ‘str2’ line from this file. The example below 
shows this: 


$ patch -p1 version1.txt < diff.patch
patching file version1.txt


$ diff -u version1.txt version2.txt 


After applying the patch, both files will be identical; 
hence, the diff command does not show any differences here. 

To revert the patch, execute the same command again
and answer y when patch asks whether to assume -R:


$ patch -p1 version1.txt < diff.patch
patching file version1.txt
Reversed (or previously applied) patch detected!  Assume -R? [n] y


The combination of diff and patch is really powerful. 
Many version control systems like Git, Subversion and CVS 
use this feature. 
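
In scripts, it is often enough to know whether two files differ, without reading the report. diff signals this through its exit status: 0 when the files are identical, 1 when they differ and 2 when an error occurs. The sketch below recreates the two sample files in a scratch directory (an assumption for this example) and branches on that status; the -q option only suppresses the line-by-line output.

```shell
# Recreate the sample files in a scratch directory.
workdir=$(mktemp -d)
printf 'str1\nstr2\nstr3\n' > "$workdir/version1.txt"
printf 'str1\nstr3\n' > "$workdir/version2.txt"

# diff exits with 0 for identical files and 1 when they differ.
if diff -q "$workdir/version1.txt" "$workdir/version2.txt" >/dev/null; then
    before="identical"
else
    before="different"
fi
echo "Before: $before"

# Make the second file a copy of the first and compare again.
cp "$workdir/version1.txt" "$workdir/version2.txt"
if diff -q "$workdir/version1.txt" "$workdir/version2.txt" >/dev/null; then
    after="identical"
else
    after="different"
fi
echo "After: $after"
```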


3) rename 

Renaming multiple files is one of the common tasks of a 
sysadmin. GNU/Linux provides the rename command which 
will serve our purpose. It is particularly useful when we want 
to rename multiple files with a specific pattern. For instance, 
the command below renames all .TXT files to .txt:


$ rename 's/\.TXT$/.txt/' *.TXT
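
The rename command is not the same program on every distribution: Debian-based systems usually ship a Perl-based version that accepts substitution expressions, while others provide the util-linux version with a different syntax. Where that is a concern, a plain shell loop is a portable alternative. The sketch below assumes a scratch directory with made-up file names and uses the ${f%.TXT} expansion to strip the old suffix.

```shell
# Scratch directory with sample files (hypothetical names).
workdir=$(mktemp -d)
touch "$workdir/notes.TXT" "$workdir/report.TXT" "$workdir/keep.md"

for f in "$workdir"/*.TXT; do
    # Skip the literal pattern if no file matches.
    [ -e "$f" ] || continue
    # ${f%.TXT} removes the trailing .TXT; the new suffix is then appended.
    mv -- "$f" "${f%.TXT}.txt"
done

# List the directory to confirm the result.
renamed=$(ls "$workdir")
echo "$renamed"
```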


Storing similar types of files in a directory is also a very
common task. We can do it very easily with a combination
of find and the mv command. The command below moves all
MP3 files to a target-dir directory:


$ find src-dir -type f -name "*.mp3" -exec mv {} target-dir \;


4) tar 

Sometimes it is convenient to operate on a single file
rather than multiple files and here, the tar command comes
into the picture. tar is a short form for ‘tape archive’. As
the name suggests, it is an archiving utility that stores
multiple files into a single one. Given below is the syntax of
the tar command:


tar [OPTIONS] [TAR NAME] [FILES TO BE INCLUDED IN TAR] 


To create a tar bundle, execute the command given below 
in a terminal: 


$ tar cvf archive.tar 1.txt 2.txt 3.txt 


In the above example: 
* c option stands for create archive
* v option stands for verbose mode
* f option indicates that the next argument is the name of the archive file

The tar command allows us to manipulate tar bundles 
without recreating them again. For instance, to add a new file 
into an archive, use the -r option as shown below: 


$ tar rvf archive.tar 4.txt 
4.txt 


$ tar tf archive.tar    # list the contents of the tar file
1.txt
2.txt
3.txt
4.txt


By default, tar only archives multiple files; it doesn’t
do any compression. There are various compression
utilities available like bzip2, gzip, zip and so on. To
compress a tar bundle using bzip2, execute the command
shown below:


$ bzip2 archive.tar 


After compression, bzip2 replaces the tar bundle with a file
that has the .bz2 extension appended. If you compare sizes,
before compression, the tar bundle was 20KB and after
compression, it gets reduced to 4KB, as shown below:


# Size before compression
$ ls -sh archive.tar
20K archive.tar


# Size after compression
$ ls -sh archive.tar.bz2
4.0K archive.tar.bz2
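
Archiving and compression can also be done in one step, because tar can call the compressor itself: the z option selects gzip and the j option selects bzip2. The sketch below uses gzip on a few generated files in a scratch directory; the file names and contents are assumptions made for this example.

```shell
# Scratch directory with three generated sample files.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
for n in 1 2 3; do
    seq 1000 > "$n.txt"
done

# c = create, z = compress with gzip, f = archive file name follows.
tar czf archive.tar.gz 1.txt 2.txt 3.txt

# t = list the contents of the compressed archive.
listing=$(tar tzf archive.tar.gz)
echo "$listing"

# x = extract; -C switches to another directory before extracting.
mkdir restored
tar xzf archive.tar.gz -C restored

# cmp exits with 0 when the restored copy matches the original.
cmp 1.txt restored/1.txt && echo "1.txt restored intact"
```

Replacing z with j in the three tar commands would use bzip2 instead, provided bzip2 is installed.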


5) fdisk 


We partition disks for better management and utilisation
of available storage. GNU/Linux provides the gnome-disk
utility which is a GUI based application. However, we can do
similar things with fdisk, which is a CLI based utility and can
be used to manipulate the disk partition table.

Manipulating disk partitions recklessly will cause data loss; 
hence, we are going to use the fdisk command with a pseudo 
disk. We’ll use a file as a disk, using the losetup command. 
Perform the steps given below to create a pseudo disk. 

First, create a file of size 200MB using the dd command: 


$ dd if=/dev/zero of=disk.img bs=1M count=200 


Next, set up this file as a loop-back device so that, 
hereafter, we can use /dev/loop0 as a device: 


$ sudo losetup /dev/loop0 disk.img


We can perform various actions with fdisk like printing 
the partition table, creating new partitions, deleting existing 
partitions, writing the partition table to disk, and so on. Let us 
perform all these actions, one by one. 

To start the fdisk utility, use the command given below: 


$ sudo fdisk /dev/loop0


After entering the above command, you will be shown a
welcome message and the system will wait for a command to
be entered. The section below describes various actions that
can be performed using fdisk.

* Print the partition table
  To print the partition table, type p and press Enter. This
  will display information about the disk and its partitions.
  As we haven’t created any partition yet, it will show only
  information about the disk.

* Create a new partition
  To create a new partition, type n and press Enter. Then
  follow the on-screen instructions.

* Validate a created partition
  To view the created partitions, type p and press Enter.

* Delete a partition
  To delete a partition, type d and press Enter. Follow the
  on-screen instructions to complete the procedure.

* Write a partition table
  To make the changes permanent, we need to write this
  partition table to the disk. Type w and press Enter to
  complete this action.

* Quit
  Type q and press Enter any time to quit the fdisk utility.


6) Networking related commands 
Networking is an essential part of a computer system.
However, it is complex and can be unstable sometimes. This
section discusses a few open source utilities that will help to
debug networking related issues.


* ping
We can use the ping command to check connectivity 
between hosts. It uses the Internet Control Message 
Protocol (ICMP) to check connectivity. Given below is 
the syntax of the command: 


$ ping [ADDRESS OF HOST] 


If you are connected to the Internet and the host 
is reachable, it’ll start displaying ping statistics. Press 
Ctrl+c to abort it. We can also specify the packet count 
for ping. It’ll stop automatically after sending count 
packets. For instance, the command below will stop after 
sending four packets: 


$ ping -c 4 google.com 


* host
This is the DNS lookup utility, which can be used to
convert host names to IP addresses and vice versa. For
instance, the command below prints all the IP addresses
attached to the google.com domain:


$ host google.com
google.com has address 216.58.203.206
google.com has IPv6 address 2404:6800:4009:806::200e
google.com mail is handled by 30 alt2.aspmx.l.google.com.
google.com mail is handled by 10 aspmx.l.google.com.
google.com mail is handled by 50 alt4.aspmx.l.google.com.
google.com mail is handled by 20 alt1.aspmx.l.google.com.
google.com mail is handled by 40 alt3.aspmx.l.google.com.


Alternatively, you can also use the nslookup utility for 
DNS lookup. 


* route
The route command is used to display routing table 
information. This table is maintained by the operating 
system. Execute the command given below to display the 
routing table on your host: 


$ route -n 


* traceroute
When we send a packet from source to destination, it may 
travel through multiple gateways. If we want to find those 
intermediate gateways, then we can use the traceroute 
command as follows: 


$ traceroute -n google.com 


The above command will show all the intermediate 
gateways between your host and google.com. 


7) wget 

Often, we download contents from the Internet/network. 
Most of the time, we use the browser to do this. However, 
GNU/Linux provides the wget utility, which can be used as a 
network downloader. This section describes a few examples. 
Given below is the syntax of the wget command: 


wget [OPTIONS] [URL] 


While downloading, it displays the progress bar, which 
shows the following: 

-- Percentage of the download completed 

-- Total amount of bytes downloaded so far 

-- Current download speed 

-- Remaining time to download 

Like other utilities, wget is also a powerful utility. It 
provides various facilities to make our life easy. If your 
Internet connection is not stable, then downloading may be 
interrupted. In that case, we can provide a retry count. For 
instance, in the example that follows, we have provided the 
retry count as 3: 


$ wget -t 3 <URL> 


It’ll try up to three times before giving up with an error. To
allow an unlimited number of retries, set the retry count to 0
as shown in the example below:


$ wget -t 0 <URL>


Being a very flexible utility, wget can also be used to
restrict the downloading speed. It provides the --limit-rate
option for this. For instance, use the command given below to
limit the download rate to 512KB per second:


$ wget --limit-rate=512K <URL> 


One of the nice things about wget is that if downloading 
is interrupted, then it can be resumed from that point. Use the 
command given below to resume an interrupted download: 


$ wget -c <URL> 


8) Working with a remote host 

We often interact with remote hosts to download or upload
content. This section discusses command line utilities that
will perform these tasks.


* scp
One of the common tasks is to transfer files between the
remote and the local host. GNU/Linux provides a remote
copy program, namely ‘scp’, which stands for ‘secure
copy’. It uses ssh for data transfer, and uses the same
authentication and provides the same security as ssh.
Given below is the syntax of the scp command:


scp [OPTIONS] user@src-host:/src-dir user@dst-host:/target-dir


To copy the contents from a remote host to the local one, 
execute the command given below in a terminal: 


$ scp -r user@remote-host.com:/remote-dir-path local-dir-path


In the above example: 

-- r option stands for recursive. It will be useful while 
copying directories 

-- user is the user name of the remote host 

-- remote-host.com is the IP address/DNS of the remote host 


* ssh
Sometimes, we need to execute a command on the 
remote host. Obviously, we can log in to that server 
and execute the command there, but what if we want 
to capture the output of that command and use it on the 
local machine? In such a scenario, we can instruct SSH 
to execute the command on the remote host using the 
syntax given below: 


$ ssh user@remote-host.com [COMMAND] 


For instance, the command given below executes the ls
command on the remote host:


$ ssh user@remote-host.com ls


* rsync
rsync is a remote as well as local file-copying tool.
The rsync utility is used to synchronise the files
and directories from one location to another in an
efficient way.
To synchronise directories on the local host, execute the 
rsync command as follows: 


$ rsync -zvr src-dir target-dir 


In the above example: 

-- z option stands for ‘enable compression’

-- v option stands for ‘verbose mode’ 

-- r option stands for ‘recursive mode’ 

To synchronise the remote directory, we have to provide 
an IP address and user name for that host. For instance, the 
following command synchronises the local directory with the 
remote host: 


$ rsync -zvr src-dir user@remote-host.com:target-dir 


9) cron 
We perform many kinds of tasks on a day-to-day basis; 
for instance, taking backup of important data, checking 
for updates, and many more. Wouldn’t it be great if we
could automate these tasks? We can achieve this using cron. We
can write cron jobs, which will be scheduled periodically. 
This section provides practical examples of cron. 

To list all available cron jobs, execute the command 
given below: 


$ crontab -l


If any cron job is configured, then it’ll be listed here; 
otherwise, the output will be empty. 

Cron jobs are stored in plain text files. To edit those files, 
we have to use the crontab command. But before that, let us 
understand the cron job format. 

A cron job consists of the following six fields:


M H DOM MON DOW COMMAND


In the above example: 

-- M stands for ‘minutes’ 

-- H stands for ‘hour’ 

-- DOM stands for ‘day of the month’ 

-- MON stands for ‘month’ 

-- DOW stands for ‘day of the week’ 

-- COMMAND field indicates the command/script to be 
executed periodically 

For instance, to run a job at 5.00 am every week (here, on
day 4 of the week, which is Thursday), we can add the
following entry:


0 5 * * 4 script.sh


To add the above entry into cron, perform the
following steps:

-- Enter the crontab -e command in the terminal and
follow the on-screen instructions:

$ crontab -e

-- Add the cron job entry and save the file:

0 5 * * 4 script.sh

That’s it; cron will run this job at the scheduled time.
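
To make the field order easier to remember, the sketch below splits a crontab entry into its five time fields and the command. The explain_cron function is a hypothetical helper written for this illustration, not something cron provides, and it only labels the fields without validating their values.

```shell
# explain_cron is a hypothetical helper for illustration only.
explain_cron() {
    # Disable globbing so that '*' fields are not expanded to file names,
    # then split the entry on whitespace into positional parameters.
    set -f
    set -- $1
    set +f
    if [ "$#" -lt 6 ]; then
        echo "expected five time fields and a command" >&2
        return 1
    fi
    printf 'minute=%s hour=%s day-of-month=%s month=%s day-of-week=%s command=' \
        "$1" "$2" "$3" "$4" "$5"
    # Everything after the fifth field is the command and its arguments.
    shift 5
    printf '%s\n' "$*"
}

explain_cron '0 5 * * 4 script.sh'
```

For the entry used above, the helper prints minute=0 hour=5 day-of-month=* month=* day-of-week=4 command=script.sh.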
10) System monitoring 
GNU/Linux provides many utilities to monitor the system.
We can monitor memory usage, disk usage, CPU usage and
so on. This section discusses some of the popular utilities that
can be used to monitor memory and disk usage.


* free
GNU/Linux provides the free command to check memory
usage. It displays the total amount of free and used
physical and swap memory in the system, as well as the
buffers and caches used by the kernel. Shown below is a
sample output of the free command:


$ free
              total        used        free      shared  buff/cache   available
Mem:        8117768     1267836     3718996      153112     3130936     6393176
Swap:       2097148           0     2097148


In the above output:
-- ‘total’ stands for the total installed memory on the current
system
-- ‘used’ stands for the used memory. It is calculated as
follows: [total - (free + buffers + cache) memory]
-- ‘free’ stands for unused memory
-- ‘shared’ stands for the shared memory used by tmpfs
-- ‘buffers’ stands for the memory used by kernel buffers
-- ‘cache’ stands for the memory used by the page cache
and slabs
-- ‘buff/cache’ stands for the sum of the buffers and cache
memory
-- ‘available’ stands for an estimation of how much
memory is available for starting new applications, without
swapping.


* du
As the name suggests, the du command is used to
calculate disk usage. It summarises disk usage of the set
of files/directories.
To calculate the size of the directory, execute the
command given below:


$ du -sh DIR-PATH


In the above example:
-- option s is used to display only a total for each argument
-- option h is used to show the output in human readable
format (K for KB, M for MB and so on)


* df
As the name suggests, the df command is used to get
information about free disk space. It reports the file system’s
disk space usage. If no file name is given, the space available
on all currently mounted file systems is shown:


$ df -h


In the above example:
-- option h is used to show output in a human readable
format (K for KB, M for MB and so on)

In this article, we have discussed some of the popular
GNU/Linux utilities briefly. Mastering these utilities will take
your knowledge to the next level. To know more about each
utility, do refer to the official documentation.


By: Narendra K.

The author is a FOSS enthusiast. He can be reached at
narendra0002017@gmail.com.


Continued from page 60


Spark can be deployed as a standalone cluster (if paired
with a capable storage layer) or can hook into Hadoop as an
alternative to the MapReduce engine.

Apache Flink
Apache Flink is an open source platform; it is a streaming
data flow engine that provides communication, fault
tolerance and data distribution for distributed computations
over data streams. It is a scalable data analytics framework
that is fully compatible with Hadoop. Flink can execute both
stream processing and batch processing easily.

While Spark performs batch and stream processing, its
streaming is not appropriate for many use cases because of its
micro-batch architecture. Flink’s stream-first approach offers
low latency, high throughput, and real entry-by-entry processing.

Table 1 compares the features of Hadoop, Storm, Samza,
Spark and Flink.


By: Prof. Madhuri Chopade and Prof. Dulari Bhatt

Prof. Madhuri Chopade is an assistant professor in the IT
department at Gandhinagar Institute of Technology (affiliated
to GTU). She can be contacted at madhuri.chopade@git.org.in.
Prof. Dulari Bhatt is an assistant professor in the IT
department at Gandhinagar Institute of Technology. She can
be contacted at dulari.bhatt@git.org.in.


How To


DevOps Series 


Ansible Deployment of Nginx with SSL 


This is the 12th article in the DevOps series. It is a tutorial on installing Nginx with SSL. 
Nginx is a high performance Web server and can be used as a load balancer. 


Nginx is a Web server written in C by Igor Sysoev. It
can be used as a load balancer, reverse proxy and
HTTP cache server. Nginx was designed to handle
over 10,000 client connections and has support for TLS
(transport layer security) and SSL (secure sockets layer). It
requires a very low memory footprint and is IPv6-compatible.
Nginx can also be used as a mail server proxy. It was first
released in 2004 under a BSD-like licence.

The OpenSSL project provides a free and open source
software security library that implements the SSL and TLS
protocols. This library is used by applications to secure
communication between machines in a computer network.
The library is written in C and Assembly, and uses a dual
licence: Apache License 1.0 and a four-clause BSD licence.
The library implements support for a number of ciphers and
cryptographic functions. It was first released in 1998 and is
widely used in Internet Web servers.

An Ubuntu 16.04.1 LTS guest virtual machine (VM) 
instance using KVM/QEMU is chosen to install Nginx. 


$ cat /etc/lsb-release
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=16.04
DISTRIB_CODENAME=xenial
DISTRIB_DESCRIPTION="Ubuntu 16.04.1 LTS"


The default installation on the guest VM does not come 
with Python2, and hence you need to install this on the guest 
machine, manually, as shown below: 


$ sudo apt-get update 
$ sudo apt-get install python-minimal 




The host system is a Parabola GNU/Linux-libre x86_64 
system, and Ansible is installed on the host system using the 
distribution package manager. The version of Ansible used is 
2.4.2.0 as indicated below: 


$ ansible --version
ansible 2.4.2.0
  config file = /etc/ansible/ansible.cfg
  configured module search path = [u'/home/shakthi/.ansible/plugins/modules', u'/usr/share/ansible/plugins/modules']
  ansible python module location = /usr/lib/python2.7/site-packages/ansible
  executable location = /bin/ansible
  python version = 2.7.14 (default, Sep 20 2017, 01:25:59) [GCC 7.2.0]


You should add an entry to the /etc/hosts file for the guest 
Ubuntu VM, as follows: 


192.168.122.244 ubuntu 


On the host system, let’s create a project directory 
structure to store the Ansible playbooks, inventory and 
configuration files, as follows: 


ansible/inventory/kvm/
       /playbooks/configuration/
       /playbooks/admin/
       /files/


An ‘inventory’ file is created inside the inventory/kvm
folder that contains the following:


ubuntu ansible_host=192.168.122.244 ansible_connection=ssh ansible_user=ubuntu ansible_password=password


You should now be able to issue commands to the guest 
OS, using Ansible. For example: 


$ ansible -i inventory/kvm/inventory ubuntu -m ping
ubuntu | SUCCESS => {
    "changed": false,
    "ping": "pong"
}


Installing Nginx 


The Nginx software package in Ubuntu can be installed on the
guest machine. The APT package repository is first updated
before installing the Nginx Web server. The Uncomplicated
Firewall (UFW) is then used to enable both HTTP and
HTTPS access on the guest OS. The Web server is then
started, and the playbook waits for the server to listen on port
80. The Ansible playbook is provided below for reference:


- name: Install nginx
  hosts: ubuntu
  become: yes
  become_method: sudo
  gather_facts: true
  tags: [nginx]

  tasks:
    - name: Update the software package repository
      apt:
        update_cache: yes

    - name: Install nginx
      package:
        name: "{{ item }}"
        state: latest
      with_items:
        - nginx

    - name: Allow Nginx Full
      ufw:
        rule: allow
        name: Nginx Full
        state: enabled

    - name: Allow OpenSSH
      ufw:
        rule: allow
        name: OpenSSH
        state: enabled

    - name: Start nginx
      service:
        name: nginx
        state: started

    - wait_for:
        port: 80

The above playbook can be invoked as follows: 


$ ansible-playbook -i inventory/kvm/inventory playbooks/configuration/nginx-ssl.yml --tags nginx -K


The -K option prompts for the sudo password of the 
Ubuntu user. You can append multiple -v to the end of the 
playbook invocation to get a more verbose output. 

If you open a browser on the host system with the URL
http://192.168.122.244, you should see the default Nginx
home page as shown in Figure 1.


Figure 1: Nginx home page


Generating SSL certificates 

The required SSL certificates need to be created using Ansible. 
The OpenSSL and Python-openssl packages are installed after 
updating the APT software repository in Ubuntu. An OpenSSL 
private key is generated in the /etc/ssl/private/ansible.com.pem
file. The /etc/ssl/csr directory is created before generating the
OpenSSL certificate signing request (CSR) with the required
certificate parameters in the /etc/ssl/csr/www.ansible.com.csr
file. The actual self-signed certificate is then generated as
shown in the following playbook:


- name: Create SSL certificates
  hosts: ubuntu
  become: yes
  become_method: sudo
  gather_facts: true
  tags: [ssl]

  tasks:
    - name: Update the software package repository
      apt:
        update_cache: yes

    - name: Install openssl
      package:
        name: "{{ item }}"
        state: latest
      with_items:
        - openssl
        - python-openssl

    - name: Generate an OpenSSL private key
      openssl_privatekey:
        path: /etc/ssl/private/ansible.com.pem

    - name: Create directory
      file:
        path: /etc/ssl/csr
        state: directory
        mode: 0755

    - name: Generate an OpenSSL Certificate Signing Request
      openssl_csr:
        path: /etc/ssl/csr/www.ansible.com.csr
        privatekey_path: /etc/ssl/private/ansible.com.pem
        country_name: IN
        organization_name: Ansible
        email_address: author@shakthimaan.com
        common_name: www.ansible.com

    - name: Generate a self signed certificate
      openssl_certificate:
        path: /etc/ssl/certs/nginx-selfsigned.crt
        privatekey_path: /etc/ssl/private/ansible.com.pem
        csr_path: /etc/ssl/csr/www.ansible.com.csr
        provider: selfsigned


The above playbook can be run as follows: 


$ ansible-playbook -i inventory/kvm/inventory playbooks/configuration/nginx-ssl.yml --tags ssl -K


Configuring Nginx for SSL 

The final step is to configure Nginx to use SSL. A
self-signed.conf file is created in the /etc/nginx/snippets
folder that contains the following:


ssl_certificate /etc/ssl/certs/nginx-selfsigned.crt; 
ssl_certificate_key /etc/ssl/private/ansible.com.pem; 


The SSL parameter configurations are stored in the
/etc/nginx/snippets/ssl-params.conf file as shown below:


# from https://cipherli.st/
# and https://raymii.org/s/tutorials/Strong_SSL_Security_On_nginx.html

ssl_protocols TLSv1 TLSv1.1 TLSv1.2;
ssl_prefer_server_ciphers on;
ssl_ciphers "EECDH+AESGCM:EDH+AESGCM:AES256+EECDH:AES256+EDH";
ssl_ecdh_curve secp384r1;
ssl_session_cache shared:SSL:10m;
ssl_session_tickets off;
ssl_stapling on;
ssl_stapling_verify on;
resolver 8.8.8.8 8.8.4.4 valid=300s;
resolver_timeout 5s;
# Disable preloading HSTS for now. You can use the commented out header line that includes
# the "preload" directive if you understand the implications.
#add_header Strict-Transport-Security "max-age=63072000; includeSubdomains; preload";
add_header Strict-Transport-Security "max-age=63072000; includeSubdomains";
add_header X-Frame-Options DENY;
add_header X-Content-Type-Options nosniff;


The Nginx Web server configuration for this host (google.com,
for example) is then created in the /etc/nginx/sites-enabled
folder with the following contents:


server {
    listen 80;

    root /var/www/html;
    index index.nginx-debian.html;

    server_name google.com www.google.com;
}

server {
    listen 443 ssl http2 default_server;
    include snippets/self-signed.conf;
    include snippets/ssl-params.conf;
}


The Ansible playbook for configuring Nginx with SSL is 
as follows: 


- name: Setup nginx with SSL
  hosts: ubuntu
  become: yes
  become_method: sudo
  gather_facts: true
  tags: [https]

  tasks:
    - copy:
        src: ../../files/self-signed.conf
        dest: /etc/nginx/snippets/self-signed.conf
        owner: root
        group: root
        mode: 0644

    - copy:
        src: ../../files/ssl-params.conf
        dest: /etc/nginx/snippets/ssl-params.conf
        owner: root
        group: root
        mode: 0644

    - copy:
        src: ../../files/google.com
        dest: /etc/nginx/sites-enabled/google.com
        owner: root
        group: root
        mode: 0644

    - name: Restart nginx
      service:
        name: nginx
        state: restarted

    - wait_for:
        port: 443


The above playbook can be executed as follows: 


$ ansible-playbook -i inventory/kvm/inventory playbooks/configuration/nginx-ssl.yml --tags https -K


You can now open https://192.168.122.244 in a browser 
on the host system, and view the self-signed certificate as 
shown in Figure 2. 

After accepting the certificate, you will be able to see the 
default Nginx home page as shown in Figure 3. 

You can also use the curl command to view the home 
page from the command line, as follows: 


$ curl https://192.168.122.244 -k
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
<style>
    body {
        width: 35em;
        margin: 0 auto;
        font-family: Tahoma, Verdana, Arial, sans-serif;
    }
</style>
</head>
<body>
<h1>Welcome to nginx!</h1>
<p>If you see this page, the nginx web server is successfully installed and
working. Further configuration is required.</p>

<p>For online documentation and support please refer to
<a href="http://nginx.org/">nginx.org</a>.<br/>
Commercial support is available at
<a href="http://nginx.com/">nginx.com</a>.</p>

<p><em>Thank you for using nginx.</em></p>
</body>
</html>


Figure 2: SSL certificate

Figure 3: Nginx HTTPS home page


Validation 

You can run a number of validation checks on the SSL certificate, 
periodically, to ascertain that it still holds good and meets your 
requirements. A few examples of sanity checks that you can 
perform on the certificate are shown below for reference: 


- name: Validate SSL certificate
  hosts: ubuntu
  become: yes
  become_method: sudo
  gather_facts: true
  tags: [validate]

  tasks:
    - name: Certificate matches with the private key
      openssl_certificate:
        path: /etc/ssl/certs/nginx-selfsigned.crt
        privatekey_path: /etc/ssl/private/ansible.com.pem
        provider: assertonly

    - name: Certificate can be used for digital signatures
      openssl_certificate:
        path: /etc/ssl/certs/nginx-selfsigned.crt
        provider: assertonly
        key_usage:
          - digitalSignature
        key_usage_strict: true

    - name: Certificate uses a recent signature algorithm (no SHA1, MD5 or DSA)
      openssl_certificate:
        path: /etc/ssl/certs/nginx-selfsigned.crt
        provider: assertonly
        signature_algorithms:
          - sha224WithRSAEncryption
          - sha256WithRSAEncryption
          - sha384WithRSAEncryption
          - sha512WithRSAEncryption
          - sha224WithECDSAEncryption
          - sha256WithECDSAEncryption
          - sha384WithECDSAEncryption
          - sha512WithECDSAEncryption

    - name: Certificate matches the domain
      openssl_certificate:
        path: /etc/ssl/certs/nginx-selfsigned.crt
        provider: assertonly
        subject_alt_name:
          - DNS:www.ansible.com

    - name: Certificate is valid for another month (30 days) from now
      openssl_certificate:
        path: /etc/ssl/certs/nginx-selfsigned.crt
        provider: assertonly
        valid_in: 2592000


You can invoke the above validation checks in the 
playbook using the following command: 


$ ansible-playbook -i inventory/kvm/inventory playbooks/configuration/nginx-ssl.yml --tags validate -K
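The valid_in value of 2592000 used in the last check is simply 30 days expressed in seconds. As a rough illustration of what such a check amounts to, here is a minimal Python sketch. The function name still_valid_in, the midnight expiry time and the fixed 'now' values are assumptions made for this example; only the expiry date of January 6, 2028 comes from the certificate shown earlier.

```python
import ssl

THIRTY_DAYS = 30 * 24 * 60 * 60  # 2592000 seconds, the valid_in value in the playbook


def still_valid_in(not_after, seconds, now):
    """Return True if a certificate that expires at `not_after` is still
    valid `seconds` after the timestamp `now` (seconds since the epoch)."""
    expires_at = ssl.cert_time_to_seconds(not_after)
    return now + seconds <= expires_at


# Hypothetical values: the time of day is assumed, and `now` is fixed so
# that the example gives the same answer whenever it is run.
expiry = "Jan  6 00:00:00 2028 GMT"
now = ssl.cert_time_to_seconds("Feb  1 00:00:00 2018 GMT")
print(still_valid_in(expiry, THIRTY_DAYS, now))
```

In practice you would pass the current time and the notAfter value read from the real certificate; the Ansible task remains the check that the playbook actually runs.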


Uninstalling 

An uninstall playbook is provided in the playbooks/admin/uninstall-nginx.yml
file to stop the Nginx Web server, disable access to the port in the
firewall, and to remove the software from the guest VM:


- name: Uninstall Nginx
  hosts: ubuntu
  become: yes
  become_method: sudo
  gather_facts: true
  tags: [server]

  tasks:
    - name: Stop the web server
      service:
        name: nginx
        state: stopped

    - name: Disable Nginx Full
      ufw:
        rule: deny
        name: Nginx Full
        state: enabled

    - name: Uninstall nginx
      package:
        name: "{{ item }}"
        state: absent
      with_items:
        - nginx


The above playbook can be run as follows: 


$ ansible-playbook -i inventory/kvm/inventory playbooks/admin/uninstall-nginx.yml -K


You can verify the firewall status using the ufw command, 
as shown below: 


$ sudo ufw status
Status: active

To                  Action    From
Nginx Full          DENY      Anywhere
OpenSSH             ALLOW     Anywhere
Nginx Full (v6)     DENY      Anywhere (v6)
OpenSSH (v6)        ALLOW     Anywhere (v6)


Please refer to the Nginx documentation website
(https://nginx.org/en/docs/) for more information.


By: Shakthi Kannan 


The author is a free software enthusiast and blogs at 
shakthimaan.com. 
Automate File Classification
with this Python Program


Unclutter your Downloads folder in Windows with this extremely useful file 
classifier which is written in Python. After running this script, you will be able to 
manage your downloads with ease. 


Internet users download several files to the Downloads folder of
Windows OS. If you examine the contents of your Downloads folder
(C:\Users\<user-name>\Downloads) you might find various files in diverse
formats like MP3, PDF, docs, srt, Zip, MP4, etc. These files populate your
C drive, and then you start moving the files from the Downloads folder
to other folders, which can be a very tiresome and monotonous task.

In this article, I am going to show you how to shift your files from the
Downloads folder to their respective folders automatically, by using a
simple Python program.


The structure of the program 
The complete program consists of a
Python script and one text file. The text
file supplies the necessary parameters
and rules for moving the files.


First, let’s write a text file called 
rules.txt, as shown below: 


C:\Users\<user-name>\Downloads
R
mp3:D:\music
pdf:D:\all_pdf
exe:D:\software
doc:D:\all_docs
docx:D:\all_docs
srt:D:\all_srts
mp4:D:\all_mp4


The first line gives the complete path of the Downloads folder, so it is
user-specific. The second line states the mode: 'R' for recursive mode
and 'S' for simple mode. In recursive mode, the program also checks the
folders inside Downloads. In simple mode, it checks only the files that
sit directly in the Downloads folder.

The remaining lines map file extensions to destination paths. This means
that MP3 files must be moved to the D:\music folder, and so on.
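To make the file format concrete, the three parts of rules.txt (path, mode and extension rules) can be pulled apart roughly as follows. This is only a Python 3 sketch: the function name parse_rules and the sample text are assumptions for illustration, and the article's own listing reads the file differently.

```python
def parse_rules(text):
    """Split the contents of a rules file into (downloads_path, mode, rules)."""
    lines = text.splitlines()
    downloads_path = lines[0].strip()
    mode = lines[1].strip().lower()
    rules = {}
    for line in lines[2:]:
        if ":" not in line:
            continue  # skip blank or malformed lines
        # Split on the first colon only, so the colon after the drive
        # letter in the destination path is left alone.
        ext, dest = line.split(":", 1)
        rules[ext.strip().lower()] = dest.strip()
    return downloads_path, mode, rules


# Hypothetical sample with the same layout as rules.txt.
sample = "\n".join([
    r"C:\Users\<user-name>\Downloads",
    "R",
    r"mp3:D:\music",
    r"pdf:D:\all_pdf",
])
print(parse_rules(sample))
```

Splitting on the first colon matters here because every destination path itself contains a colon after the drive letter.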

Let us look at the program class1.py, 
and try and understand it: 


# program created by mohit 
# official website L4wisdom.com 
# email-id mohitraj.cs@gmail.com 


Import the required modules:


import os 
import shutil 


The following lines build the path of the rules.txt file, which is
expected in your Windows user folder, C:\Users\<user-name>\.

file_path = os.path.expanduser('~')
file_name = os.path.join(file_path, "rules.txt")


Get the path and mode from rules.txt:

file_t = open(file_name, 'r')
path = file_t.readline()
mode = file_t.readline()
mode = mode.strip("\n").lower()
path1 = path.strip("\n")


The following function returns a dictionary that contains the rules. The
dictionary's keys are the file extensions, and its values are the
corresponding destination paths.

def rules():
    dict1 = {}
    # file_t is already positioned after the first two lines
    for each in file_t:
        each = each.strip("\n")
        if each.split(":", 1)[0]:
            file_ext, dest_path = each.split(":", 1)
            file_ext = file_ext.strip()
            dest_path = dest_path.strip()
            dict1[file_ext] = dest_path
    return dict1


The following function takes a list 
of files, and moves the files to their 
respective folders: 


def file_move(files_list):
    for file in files_list:
        if "." in file:
            ext = file.rsplit(".", 1)[1]
            ext = ext.strip()
            if ext in dict1:
                dst = dict1[ext]
                try:
                    print(file)
                    shutil.move(file, dst)
                except Exception as e:
                    print(e)


The following function is used when simple mode is selected:

def single_dir(path1):
    os.chdir(path1)
    files = os.listdir(".")
    file_move(files)


The following function is used when recursive mode is selected:

def rec_dirs(path1):
    for root, dirs, files in os.walk(path1, topdown=True, onerror=None, followlinks=False):
        #print files
        os.chdir(root)
        file_move(files)
    print("files are moved")


dict1 = rules()

if mode == 'r':
    rec_dirs(path1)
else:
    single_dir(path1)


Figure 1: Converting the Python program into an .exe file 


Running the program
So, the program is ready. Now let us convert the Python program into an
.exe file.

In order to do this, use the installer shown in Figure 1. After
conversion, it will make a directory called class1\dist, as shown in
Figure 2. Get the class1.exe file from the directory class1\dist and put
it in the Windows folder. Before doing that, you can rename the exe
file. I am calling it cfr.exe. In this way, cfr.exe is added to the
system path. cfr.exe works like a DOS command.

In order to run the 
program, use Windows’ run 
facility, as shown in Figure 3. 


Figure 2: .exe files

Figure 3: Run the program (the Windows Run dialog with 'cfr' typed in the Open box)


After a successful run, check the folders specified in rules.txt. If
the folders do not exist, then the program automatically creates the
folders.

Figure 4: Folders are created and files are moved (the listing shows the folders all_docs, all_mp4, all_pdf, softwares, all_srts and music)
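One caveat on folder creation: according to the standard library documentation, shutil.move treats a destination that does not exist as the new name of the file, and does not create a folder with that name. If you want the destination folders to be created on demand, a Python 3 variant of the simple mode could create them explicitly. The sketch below is an assumption-laden illustration (the function name move_by_extension and the temporary test folders are made up) and is not the article's program.

```python
import os
import shutil
import tempfile


def move_by_extension(src_dir, rules):
    """Move files in src_dir to the folder that `rules` maps their extension to."""
    moved = []
    for name in os.listdir(src_dir):
        src = os.path.join(src_dir, name)
        ext = os.path.splitext(name)[1].lstrip(".").lower()
        if not os.path.isfile(src) or ext not in rules:
            continue
        dest_dir = rules[ext]
        os.makedirs(dest_dir, exist_ok=True)  # create the target folder if needed
        shutil.move(src, os.path.join(dest_dir, name))
        moved.append(name)
    return sorted(moved)


# Demonstration in a throwaway directory instead of a real Downloads folder.
with tempfile.TemporaryDirectory() as tmp:
    downloads = os.path.join(tmp, "Downloads")
    os.makedirs(downloads)
    for name in ("song.mp3", "paper.pdf", "notes.xyz"):
        open(os.path.join(downloads, name), "w").close()
    rules = {"mp3": os.path.join(tmp, "music"),
             "pdf": os.path.join(tmp, "all_pdf")}
    print(move_by_extension(downloads, rules))
```

The sketch handles only one folder level; the recursive mode would wrap the same logic in os.walk, as the original listing does.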




By: Mohit 


The author is a certified ethical hacker and EC Council certified security analyst. He 
has a master's degree in computer science from Thapar University. You can contact 
him at mohitraj.cs@gmail.com and https://www.linkedin.com/in/mohit-990a852a/. 
Singleton Design Pattern?


A Singleton design pattern ensures that only one 
instance of a class is created. 


Imagine a disorganised wardrobe with heaps of clothes in every section.
How would you organise it? When I asked my students this question, I got
the following answers:
1. Empty the wardrobe
2. Arrange the clothes
   * on the basis of the most frequently used ones
   * or the favourite ones
   * or by their type (e.g., Indian/Western)
3. Fix each section for each category and arrange the clothes accordingly

If a similar problem occurs in another context, such as when
re-arranging a bookshelf, the same solution can be reused. This is the
basic principle behind design patterns. With respect to object-oriented
software engineering, a design pattern describes a recurring problem and
provides a reusable solution.

Types of design patterns

Design patterns have been categorised by the Gang of Four or GoF (Gamma,
Helm, Johnson and Vlissides) into three types:
1. Creational: Patterns that deal with object creation
2. Structural: Patterns that deal with the composition of classes or
   objects, and their relationships
3. Behavioural: Patterns that focus on how objects distribute work, and
   how each object working independently can help achieve a common goal

The Singleton design pattern

Let us look at the website of a restaurant as an example. The website
has a menu link titled Menu. Every time a user clicks on this link, data
is retrieved from the database and stored in an instance (object) of a
class named 'Menu'. The class diagram for 'Menu' is shown in Figure 1.
Menu

- dishes : List<Dish>

+ getMenuDetails() : List<Dish>
+ fetchMenuDetailsFromDatabase()

Figure 1: Class diagram for 'Menu'


The problem here is that on every subsequent visit to the Menu page,
even though the menu details have been fetched once by the same client,
the program fetches them again and stores the details in a new object.
Is there any way that we can reduce the number of objects created for
this class to one, so that this object can be shared across all the
classes?

One solution is to make the object global. But this is not a recommended
practice in object-oriented programming as this would require tracking
of all the global objects and their current status in the program. Also,
it does not support the principle of encapsulation.

The GoF introduced the Singleton pattern, wherein 
only one instance/object of a class is created. This 
operation is performed in a static member function of 
the class. The constructor of the class is made private so 
that this class cannot be instantiated from anywhere else 
except from within the class itself. The class diagram is 
shown in Figure 2. As you can see, the instance named 
‘singleton’ is made private and static (underlined), and 
getInstance() is again a static member function that 
creates and returns the instance of the class. 

The getInstance member function will first check 
whether the instance is null or not. If it is null, it creates 
a new object. Otherwise it returns the existing object. 

The same process can be used to limit the number of 
objects to any given number by using a counter to keep 
track of the number of objects. 
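The article's listings are in Java; purely as an illustration, the counter idea above might be sketched in Python as follows. The class name LimitedPool, the limit of two and the round-robin reuse of existing objects are assumptions made for this sketch, not something taken from the GoF description.

```python
class LimitedPool:
    """Allows at most MAX_INSTANCES objects (hypothetical example class)."""

    MAX_INSTANCES = 2
    _instances = []   # objects created so far
    _next = 0         # round-robin index used once the limit is reached

    def __init__(self):
        # Python has no private constructors, so direct construction is
        # refused here to mimic the private constructor of the pattern.
        raise RuntimeError("use LimitedPool.get_instance() instead")

    @classmethod
    def get_instance(cls):
        if len(cls._instances) < cls.MAX_INSTANCES:
            obj = object.__new__(cls)   # bypass __init__ on purpose
            cls._instances.append(obj)
            return obj
        # Limit reached: hand out the existing objects in turn.
        obj = cls._instances[cls._next % cls.MAX_INSTANCES]
        cls._next += 1
        return obj


objects = [LimitedPool.get_instance() for _ in range(5)]
print(len({id(o) for o in objects}))  # number of distinct objects handed out
```

With MAX_INSTANCES set to 1, the same class behaves as a plain singleton.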

There are two ways of instantiating the class — eager 
instantiation and lazy instantiation. 

In eager instantiation, the object is created when the 
class is loaded. Therefore, irrespective of whether the 
object is used or not, it is created. The Java code for eager 
instantiation is given below: 


Singleton

- singleton : Singleton

- Singleton()
+ getInstance() : Singleton

Figure 2: Class diagram for the Singleton design pattern
(Source: https://en.wikipedia.org/wiki/Singleton_pattern)


public class EagerInitializedSingleton {

    // created once, when the class is loaded
    private static final EagerInitializedSingleton instance =
            new EagerInitializedSingleton();

    // private constructor, so that client applications cannot call it
    private EagerInitializedSingleton() {}

    public static EagerInitializedSingleton getInstance() {
        return instance;
    }
}

On the other hand, lazy instantiation creates 
an object only when the static member function is 
called, as shown below: 


public class LazyInitializedSingleton {

    private static LazyInitializedSingleton instance;

    private LazyInitializedSingleton() {}

    public static LazyInitializedSingleton getInstance() {
        // create the object only on the first call
        if (instance == null) {
            instance = new LazyInitializedSingleton();
        }
        return instance;
    }
}


Anyone who plans to implement this pattern in their
program needs to take care of two major points:
* The constructor needs to be made private so that
  in no circumstances can the class be instantiated
  from any other class.
* While creating child classes of the singleton class, the
  instance must be initialised with the instance of the
  corresponding child class.
Using a Single Sign-on in
Java based Web Applications


Authentication to several related but independent software systems by logging in with
a single user ID and password is called single sign-on. It is an access control property,
whereby the user can seamlessly gain access to multiple connected systems without
separately signing in with different user names and passwords.


Web application development is one of the major areas in IT software
development. With clients becoming smarter, design architects
and solutions providers are spending most of their time in
the research and development of quicker, cheaper, more
flexible and comprehensive solutions to suit any complex
need of the user.

Of the many aspects of Web application development 
(or Web apps in a techie’s language), the most challenging 
and interesting sub-section is the authorisation and sign-on 
facility for a Web app. 

The most accepted and common way of authorisation 
is the log in/password, which the user has to supply. This is 
counter-matched with the records in the store (a database, 
usually) and permission is granted based on criteria like 
roles and permissions. 

As customer demands grow more complex, the situation becomes
challenging when the development community is given more than one Web
app interlinked for different purposes, and is asked to provide a
sign-on for them.

This is when a single sign-on (SSO) makes sense. This article looks at
its exact meaning, why it is required and how it got its current shape.
We will also briefly discuss the various facilities/technologies
available nowadays for the single sign-on.


SSO: In the early phase of adoption 

To handle authentication for different Web apps using single 
sign-on information, certain features were proposed earlier. 
Two of these are listed below. 

Bypass login: A bypass login is a crude workaround to
handle logins for multiple related Web apps. It handles a login
to the second application with a flag as a cookie set during
the first Web app login. If this flag is set, then the second
Web app will log in automatically. There is a huge problem
in this method as the second Web app will not be role or
policy based and hence will have the same amount of access/
facilities for any login.

Moreover, the security of the second Web app’s login 
is not guaranteed. To overcome this, there was a minor 
change done in the bypass login, whereby there is an extra 
parameter passed from the first Web app to the second 
one, which is the key to access the second Web app. 

The drawback in this process is that there is always the 
dependency of the key parameter, and the second Web app 
has to be always called from the first one (scheduled or 
automated tasks cannot be run). 
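To picture the key parameter, here is one way such a scheme could look, sketched in Python rather than Java for brevity. The article does not say how the key is produced; the HMAC construction, the shared secret and the function names make_key and accept_login are all assumptions chosen for this illustration.

```python
import hashlib
import hmac

SHARED_SECRET = b"change-me"  # hypothetical secret known to both Web apps


def make_key(username):
    """First Web app: derive the key parameter passed along with the user name."""
    return hmac.new(SHARED_SECRET, username.encode("utf-8"),
                    hashlib.sha256).hexdigest()


def accept_login(username, key):
    """Second Web app: accept the bypass login only if the key matches."""
    return hmac.compare_digest(make_key(username), key)


key = make_key("alice")
print(accept_login("alice", key))     # key issued for this user
print(accept_login("mallory", key))   # same key replayed for another user
```

Even in this form the drawback described above remains: the second Web app can only be entered with a key issued by the first one.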

Web service call: This is generally used when the second 
and subsequent Web apps expose certain Web services, and 
the first Web app invokes calls to these Web apps through 
these Web service calls. The problem in this feature is that 
the developer is more dependent on the structure of the input/ 
output of the Web service and is bound to handle all requests 
through them only. 


What is single sign-on? 

Though a bypass login and Web service call provide facilities 
to skip subsequent logins to different Web apps, there are 
some drawbacks in their overall architecture, as seen earlier. 
If we first understand the need for using multiple Web apps, 
we can easily grasp why a single sign-on is required. 

Consider a complex enterprise Web app for a textile 
firm which involves various operations like ordering, 
samples, mailing, designing, etc. If we develop a single 
Web app to cater to all these needs, then maintaining it 
will be highly complex. 

A service provider company that designs software for 
different industries will be always interested in developing 
individual modules and integrating the required pieces when 
desired. This will help in the sizing of the module and its 
maintenance. The process and benefits of this approach are 
as follows: 
* Usage of different technologies in different modules,
  depending on what the module is required to do.
* Easy usage and faster work due to the delegation of
  several actions in parallel across different modules.
* Allows for the reuse of the modules for different Web
  solutions, as and when required.

Now when we think of a single sign-on facility for
the above case, it is clear that a bypass login or a
Web service call cannot be advantageous in such real-world
situations; in fact, they may further complicate the solution as
it grows bigger in order to integrate more and more modules.

In short, single sign-on is a facility that reuses the
login information from the first Web app for all subsequent
Web apps, without popping up a login screen for each one.
There are many standard technologies and techniques used 
for SSO in the industry, and many of them are open source 
or freeware. We will discuss some of the most popular and 
robust technologies here. 


Figure 1: Bypass login (diagram labels: User call (1), Web container, User call to second Web app (3), Response (5))

Figure 2: Single sign-on


But before delving further into the topic, let me share the 
terms that are going to be used hereafter, along with their 
meaning and usage. 

LDAP — Lightweight Directory Access Protocol: This is a 
protocol used commonly in Web apps for looking up certain 
information from the servers; it is widely used in mail servers. 

Datastore: This is the store in LDAP that keeps 
authentication and other access related data. 

Directory server: This provides the datastore for LDAP 
and supports many protocols internally. 

Authentication: This is a piece of information that is 
verified to certify or deny access, like the user name/password. 

Realms: This is a kind of group or system to provide 
facilities for authentication. It offers a lot of features for 
configuration like policy setting, access control, privileges 
for realms, etc, which can be easily configured for role 
based access for different groups and logins. There are two 
configurations in OpenSSO -- Default and New Config. The 
Default configuration is easy. The advantage is that it can be 
customised later and hence is meant for basic or first-time 
users. New Config, however, is meant for advanced users, 
and there may be some problems with getting the directory 
service port or cookie domain if the user doesn’t know each 
configuration setting and doesn’t provide them. 


OAuth 


One of the most popular and commonly used single sign-on
solutions for Java Web apps is OAuth or OAuth 2.0, as the
API library provided has a comprehensive facility for easy
configuration, and it can be used with different application
servers and a complex Web component architecture. But there
are many other open source SSO frameworks available, too,
which are listed below.


OpenSSO


This is more flexible in configuration as it has its own Web 
app and comes with a GUI based sophisticated configuration 
screen, which can be used to configure settings for a Web 
container using different Web apps for a single sign-on. 

It also needs an LDAP (we will discuss this later) and 
a directory service for handling SSO authentication and a 
data store. It has a default configuration, which uses generic 
LDAPv3. We can configure it using the Sun Java System 
Directory Server for the data store. 


JOSSO 


JOSSO is another facility for SSO which is purely Java based 
and is also called Java Open SSO. The new version of JOSSO 
(1.6) supports all the latest Web servers like JBoss 4.2. This is 
a widely used solution, and the configuration is comparatively 
easier than OpenSSO. 

There are some problems that can arise in JOSSO (as 
discussed a lot in JOSSO forums), particularly in relation to 
JaasSecurity Manager in JBoss 4.x, which was not the case 
with JBoss 3.x. Also, with some proxy settings, JOSSO can 
be difficult to configure. 

JOSSO can also be configured with LDAP, and it uses 
internal JAAS from JBoss for authentication. 


OpenLDAP


OpenLDAP is an open source provider of the SSO facility. It
is complex to use but has a lot of features for authentication
and configuration. OpenLDAP cannot be used alone, as
it also requires an SSO application (like OpenSSO) for
handling logins. We can use a combination of OpenSSO with
OpenLDAP configuration. The main drawback in this open
source application is that there is no good documentation for
configuring and setting it up.


Apache Directory Studio 

This is sophisticated technology which can be used as a 
custom authentication facility to the level that the user wants. 
The drawback, if any, is that Apache Directory server is 
supported only in Mac OS X, Linux and Windows. 


SpringLDAP 

SpringLDAP is a comprehensive LDAP feature from the
Spring framework but it doesn't work for SSO. It is meant for
LDAP provisioning for directory services. Hence, it can be
used with an SSO application for handling directory services.

There is a sub-module in the Spring framework called
Acegi security, which can be used for SSO with LDAP. It has
an easy-to-use approach called 'persistent login cookie' to
bypass login. This is also called RememberMe.

The RememberMe facility has two types of approaches:
* It stores login information (Base64-encoded) in a cookie.
* It stores login information (Base64-encoded) in a database.

For the second approach, we need a table called
'persistent_logins' with user details (which needs migration if
your current Web app uses a different table/database for user
information). The first approach can be simple and easy-to-use,
but depends on the client facility to enable cookies.
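As a rough picture of the cookie-based approach, the sketch below follows the general shape that older Spring Security documentation describes for its token-based remember-me cookie: the user name, an expiry time and an MD5 signature, joined by colons and Base64-encoded. It is written in Python for brevity, and the function names, the sample user store and the key are assumptions; treat it as an illustration rather than the framework's actual implementation.

```python
import base64
import hashlib
import hmac


def make_cookie(username, password, expires_ms, key):
    """Build the cookie value: base64(username:expiry:md5(username:expiry:password:key))."""
    signature = hashlib.md5(
        f"{username}:{expires_ms}:{password}:{key}".encode("utf-8")
    ).hexdigest()
    raw = f"{username}:{expires_ms}:{signature}"
    return base64.b64encode(raw.encode("utf-8")).decode("ascii")


def check_cookie(cookie, lookup_password, now_ms, key):
    """Return the user name if the cookie is genuine and unexpired, else None."""
    try:
        username, expires_ms, _signature = (
            base64.b64decode(cookie).decode("utf-8").split(":")
        )
        expires_ms = int(expires_ms)
    except ValueError:
        return None  # not Base64, not text, or not three fields
    password = lookup_password(username)
    if password is None or expires_ms < now_ms:
        return None
    expected = make_cookie(username, password, expires_ms, key)
    return username if hmac.compare_digest(expected, cookie) else None


users = {"alice": "s3cret"}  # hypothetical user store
cookie = make_cookie("alice", users["alice"], 2000, "app-key")
# Base64 is an encoding, not encryption: the user name can be read back.
print(base64.b64decode(cookie).decode("utf-8").split(":")[0])
print(check_cookie(cookie, users.get, 1000, "app-key"))
```

Note that the password itself never appears in the cookie; only a signature derived from it does, which is why changing the password invalidates existing cookies in this kind of scheme.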


Liferay 

Liferay is a ready-to-use portal service provider that has a 
lot of technologies embedded in it to address the various 
requirements of a Web solution. Liferay supports SSO using 
CAS (Central Authentication Service) from Yale. This is an 
open source SSO solution. Please note that CAS requires a
certificate created using the Java keytool utility. Also, CAS
supports the RememberMe feature like the one explained
above (Spring Acegi RememberMe).

Liferay supports many other SSO applications like NTLM 
(not freeware) and OpenSSO; it also supports LDAP and 
JAAS. The technical specifications of Liferay are given at 
https://www.liferay.com/product/tech-specs. 

The Liferay admin guide (liferay-administration- 
guide-5.1.pdf) gives details on CAS, OpenID and other 
SSO solutions supported by Liferay, as well as on using 
Liferay with different application servers. [It has details on 
configuring JAAS and configuring SSO in all the latest Web 
servers for Liferay. ] 

This article just helps in giving readers a feel of the 
different technologies that can be used and doesn’t intend 
to be a user guide. It is up to the architect and designer to 
evaluate and analyse these technologies based on their actual 
requirements and bandwidth of usage.




By: Magesh Kasthuri 


The author is a senior distinguished member of the 
technical staff at Wipro. He is an enterprise architect in 
BFSI, and can be reached at magesh.kasthuri@wipro.com. 


78 | FEBRUARY 2018 | OPEN SOURCE FOR YOU | www.OpenSourceForU.com 


Ten Popular Open Source 
Tools for Developers 


Here is a selection of the ten most popular open source tools for 
developers, along with a brief introduction to each. 


As the new year has just started, let me present ten popular open
source tools of the previous year for developers. This list includes
version control systems, integrated development environments (IDEs),
text editors, and Web and mobile frameworks. All these are regularly
used by developers to create new applications.


1. Git 


Git is a free and open source distributed version control 
system designed to handle everything from small to very large 
projects, with speed and efficiency. With the rise of GitHub, 
Git has become a de facto standard and, according to several 
surveys, is now the most popular version control system 
among software developers. Its users include all the biggest 
names in the technology industry, such as Google, Facebook, 
Twitter, Microsoft, LinkedIn and Netflix. It is also very 
popular with open source projects such as the Linux kernel, 
Eclipse, GNOME and others. Git is easy to learn and has a
tiny footprint, with lightning fast performance. It outclasses
SCM tools like Subversion, CVS, Perforce and ClearCase 
with features like cheap local branching, convenient staging 
areas and multiple workflows. 


2. Eclipse 

Eclipse is an integrated development environment (IDE) used 
in computer programming. It is the most widely used Java 
IDE, but it also supports C/C++, PHP and JavaScript. Eclipse 
got started in 2001 when IBM donated three million lines of 
code from its Java tools to develop an open source integrated 
development environment. Eclipse is released under the terms 
of the Eclipse Public License. The Eclipse Foundation, which 
oversees development of the IDE, supports more than 250 
open source projects, most of them related to development 
tools. There are also loads of plugins available that bring code 
quality, version control and other capabilities to the integrated 
development environment. 
3. Apache HttpClient network stack
Most native mobile apps use the Internet to ‘phone home’ 
to a database or service. They often rely on a custom 
network stack for this because a browser-based connection 
would be slow, consume too many resources and provide 
an attack surface for malicious code. To implement the 
custom connection between the app and the service 
back-end, Apache’s HttpClient provides a capable HTTP 
connectivity framework that can be used for this purpose. 
HttpClient lets you readily construct the headers 
and bodies of HTTP requests as well as the responses to 
support a private communications session. For example, 
you can easily set up an authenticated transaction using
an Authorization header or grab the data in a Set-Cookie
header to manage cookies. Curiously, while the Java
Development Kit (JDK) has its own HTTP stack, it 
doesn’t implement the PATCH method, while HttpClient 
does. PATCH is important because it lets you selectively 
update subsets of data with low network overhead, 
particularly if your service relies on the Open Data 
(OData) Protocol. 
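To show what a selective update looks like on the wire, the sketch below builds (but does not send) a PATCH request whose body carries only the field that changes. It uses Python's standard urllib rather than HttpClient, purely for illustration; the URL and the Price field are hypothetical.

```python
import json
import urllib.request

# Hypothetical OData-style endpoint; nothing is sent over the network here.
url = "https://example.com/odata/Products(1)"
changes = {"Price": 19.99}  # only the fields that should change

request = urllib.request.Request(
    url,
    data=json.dumps(changes).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="PATCH",
)
print(request.get_method(), request.full_url)
print(request.data)
```

A PUT request for the same resource would normally have to carry the complete record, which is where the network saving of PATCH comes from.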


4. Node.js 

Node.js is a JavaScript runtime built on Chrome's
V8 JavaScript engine. Node.js uses an event-driven,
non-blocking I/O model that makes it lightweight and
efficient. Developers can use it to write server side
applications in JavaScript.
In recent years, the project has skyrocketed in popularity 
and its users include IBM, Microsoft, LinkedIn, Netflix, 
PayPal, Yahoo, Walmart and many other well-known Web 
companies. According to its website, the Node.js package 
ecosystem, npm, is the largest ecosystem of open source 
libraries in the world. 


5. Cordova 

Sponsored by the Apache Software Foundation, Cordova allows
mobile developers to write for iOS, Android, Windows 
and other platforms using Web development technologies 
like HTML, CSS and JavaScript. Many other mobile 
development frameworks, most notably PhoneGap, are 
based on the Cordova code base. Applications execute 
within wrappers targeted at each platform, and rely on 
standards-compliant API bindings to access each device’s 
sensors, data and network status. 


6. Emacs 

Emacs is a family of text editors that is characterised by 
its extensibility. The manual for the most widely used 
variant, GNU Emacs, describes it as the extensible, 
customisable, self-documenting, real-time display editor. 
GNU Emacs boasts of content-aware editing modes with 
syntax colouring, built-in documentation and tutorials, 
full Unicode support and tools for project planning, 
debugging and more. 


7. Vim 

Vim is a highly configurable text editor that makes 
creating and changing any kind of text very efficient. It is 
included as ‘vi’ with most UNIX systems and with Apple 
OS X. Vim is designed for use both from a command line 
interface and as a standalone application in a graphical 
user interface. Key features include a multi-level undo 
tree, support for hundreds of programming languages,
an excellent 'Search and replace' tool and an extensive
plugin system. Vim is rock stable and is continuously 
being developed to become even better. 


8. ASP.NET

ASP.NET is an open source server side Web application 
framework designed for the development of dynamic 
Web pages. It was developed by Microsoft to allow 
programmers to build dynamic websites, Web 
applications and Web services. It can be integrated with 
many other Microsoft development tools including 
Visual Studio. It allows you to use a full featured 
programming language such as C# or VB.NET to build 
Web applications easily. 


9. Bootstrap 

Bootstrap is a free and open source front-end Web 
framework for designing websites and Web applications. 
It contains HTML and CSS based design templates
for typography, forms, buttons, navigation and other
interface components as well as optional JavaScript 
extensions. Unlike many Web frameworks, it is only 
linked to front-end development. It was developed by 
Twitter and the first version was released in 2011. 


10. Ruby on Rails 

Ruby on Rails, or Rails, is a server side Web application 
framework written in Ruby under the MIT License. 
Rails is a model-view-controller (MVC) framework,
providing default structures for a database, a Web 
service and for Web pages. It encourages and facilitates 
the use of Web standards such as JSON or XML for data 
transfer, and HTML, CSS and JavaScript for display and 
user interfacing. In addition to MVC, Rails emphasises 
the use of other well-known software engineering 
patterns and paradigms, including ‘convention over 
configuration’ (CoC), ‘don’t repeat yourself’ (DRY), and 
the active record pattern. Its users include some of the 
most popular services on the Internet, such as GitHub, 
Airbnb, Basecamp and Hulu.


By: Neetesh Mehrotra 


The author works at TCS as a systems engineer. His areas 
of interest are Java development and automation testing. He 
can be contacted at mehrotra.neetesh@gmail.com. 
Top Ten Open Source Tools for Building Websites

You can build a website to grow your business using these popular free and
open source website building tools.


Nowadays, whether you are an individual entrepreneur or representing a
business organisation, a website is a must for personal and professional
growth. Organisations are spending lots of money to build attractive
websites. In this article, we are going to look at some of the open
source website building tools that you can use to build your website on
your own, without much knowledge about programming or the Internet.




1. WordPress 

WordPress is one of the most popular 
open source free CMS (content 
management system) frameworks 
available in the website market. The 
majority of those who don’t know much 
coding prefer to create a website using 
WordPress. There are two variants
of WordPress. The first is wordpress.com, where you can host your own website, which will have the domain name wordpress.com at the end, for example, maulikparekh.wordpress.com. By paying an amount, you can change the name to your own domain name, say, maulikparekh.com.

Figure 1: WordPress

The second variant is wordpress.org, where you can download the WordPress framework and install it with your own hosting provider. Most hosting providers support WordPress by default, so we often don't need to install it ourselves. But we need to buy everything, like the domain name and hosting space. Based on their requirements and experience, people choose different options.

If you are a beginner, you can start with wordpress.com and later, once you are comfortable, migrate to a self-hosted site using wordpress.org. If you don't have money constraints, then I would definitely advise you to go with your own hosted website using wordpress.org, because you get more freedom, speed, performance and flexibility, all of which are limited on wordpress.com. The WordPress framework is written in PHP.

The official websites for WordPress 
are https://wordpress.com and 
https://wordpress.org/. 


2. Joomla
Joomla is an open source free CMS platform similar to WordPress, and is the most popular framework after WordPress. Joomla is also built on PHP. Joomla's installation is not as quick as WordPress, but it's not tough either. There are lots of menus in Joomla compared to WordPress. Many people think that Joomla is more powerful than WordPress. Overall, anyone can opt for Joomla, since it is a stable and popular framework. In the case of Joomla, too, you can host your website on it by downloading the Joomla framework and creating self-hosted websites.

Figure 2: Joomla

The official website for Joomla is 
https://www.joomla.org/. 


3. Drupal
Drupal is another open source CMS framework like Joomla and WordPress. Drupal's installation is also similar to Joomla and WordPress. There is an installation script which will install the framework. It also offers different distributions. Drupal bundles are used to create specific kinds of websites. Drupal is a bit complicated after installation. Beginners find it difficult compared to other popular CMS systems, when it comes to changing components of the websites.

Figure 3: Drupal

The official website for Drupal is 
https://www.drupal.org/. 

All the above three CMS platforms are very popular and there are many similarities between them.
* WordPress, Joomla and Drupal are free and open source PHP based frameworks, licensed under GPL.
* All three support MySQL as the primary database.
* There are different themes, bundles and modules available to add different features and functionalities.
* All are community driven projects with very good support.

Figure 4: OpenCms


4. OpenCms
OpenCms is a Java and XML based technology, which is used as a professional content management system. We can create beautiful websites using OpenCms, which is also open source software.
The official website for OpenCms is http://www.opencms.org/en/.

Figure 5: Orchard Project


5. Orchard Project
Orchard Project is a free, open source CMS built on the ASP.NET MVC platform. It is also a community driven project. Some of the features of this CMS are modularity, security, multi-lingual support, and being multi-tenant based; we can also create workflows to trigger specific jobs.
The official website for Orchard Project is http://www.orchardproject.net/.

Figure 6: concrete5


6. concrete5
concrete5 is a free and open source CMS that is coded in the MVC (model-view-controller) software architectural pattern with an object-oriented language. It is secure, flexible, SEO friendly and mobile ready. It also has a marketplace. It has a medium-sized community base and support.
The official website for concrete5 is https://www.concrete5.org/.

Figure 7: SilverStripe


7. SilverStripe
SilverStripe is a free and open source CMS framework for creating websites. It's a PHP based framework, which also provides a WYSIWYG website builder. It is not as popular as WordPress, Joomla and Drupal.
The official website for it is https://www.silverstripe.org/.

Figure 8: MODX
8. MODX
MODX is also a free and open source PHP based framework for publishing content on the World Wide Web and intranets.
The official website is https://modx.com/.


9. django CMS
As the name suggests, django CMS is based on Django as well as Python.

Figure 9: django CMS

Continued on page 88

An Introduction to NumPy


Numerical Python or NumPy is a Python programming language library that supports 
large, multi-dimensional arrays and matrices, and comes with a vast collection of high- 
level mathematical functions to operate on these arrays. 


Matrix-sig, a special interest group, was founded in
1995 with the aim of developing an array computing
package compatible with the Python programming
language. In the same year, Jim Hugunin developed a 
generalised matrix implementation package, Numeric. Later, 
in 2005, Travis Oliphant incorporated features of NumArray 
and C-API into Numeric code to create NumPy (Numerical 
Python), which was developed as a part of the SciPy project. 
These were separated from each other to avoid installing the 
large SciPy package just to get an array object. 

In the rest of this article, '>>>' represents the Python
interpreter prompt; statements without this prompt show the
output of the code.


Installation 

NumPy is a BSD-new licensed library for the Python 
programming language. It comes with Python distributions like 
Anaconda, Enthought Canopy and Pyzo. It can also be installed 
using package managers like dnf or pip. In Ubuntu and Debian
systems, use the following command for installing NumPy:


sudo apt-get install python-numpy 


Once NumPy is installed in the system, we need to import 
it to use the functionalities provided by it: 


import numpy as n 


The above command will create an alias ‘n’ while 
importing the package. Once you import the package, it will 
be active till you exit from the interpreter. If the programs are 
saved in files, you must import NumPy in each file. 


NumPy, SciPy, Pandas and Scikit-learn 

In Python, many packages provide support for scientific 
applications. NumPy, like Matlab, is used for efficient array 
computation. It also provides vectorised mathematical 
functions like sin() and cos(). SciPy assists us in scientific
computing by providing methods for integration, 
interpolation, signal processing, statistics and linear algebra. 
Pandas helps in data analysis, statistics and visualisation. 
We use Scikit-learn when we want to train a machine 
learning algorithm. NumPy may seem less capable than these
packages, but the beauty is that all of them are built on
NumPy and rely on its arrays for their working.
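
As a small illustration of the vectorised functions mentioned above, the sketch below applies sin() and cos() to a whole array in one call. It assumes Python 3, and the variable names are illustrative only.

```python
import numpy as np  # 'np' is the conventional alias; this article uses 'n'

# Four angles between 0 and pi/2 (inclusive)
angles = np.linspace(0.0, np.pi / 2, 4)

# sin() and cos() act on every element at once; no explicit loop is needed
sines = np.sin(angles)
cosines = np.cos(angles)

# sin^2 + cos^2 should equal 1 for every angle (up to rounding error)
identity = sines ** 2 + cosines ** 2
print(identity)
```

The same pattern applies to most mathematical functions in NumPy: they accept an array and return an array of the same shape.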


ndarray 

The soul of NumPy is ‘ndarray’, an n-dimensional array. 
You can see from the code segment given below that 
whenever you apply the function ‘type’ on any NumPy array, 
it will return the type numpy.ndarray irrespective of the 

type of data stored in it. 


>>>import numpy as n 
>>>a=n.array([1,2,3]) 




>>>type(a) 

<type 'numpy.ndarray'>
>>>b=n.array(['a','b','c'])
>>>type(b)
<type 'numpy.ndarray'>


Unlike the list data structure in Python, ndarray holds 
elements of the same data type only. A few attributes of the 
ndarray object are listed below. Examples shown with the 
description of each attribute refer to the array ‘a’, whose 
definition is as follows: 


>>>a=n.array([[1,2,3],[3,4,5]])


ndarray.ndim: This displays the number of dimensions 
of the array. The array ‘a’ in the example is two-dimensional, 
and hence the output is as follows: 


>>>a.ndim 
2 


ndarray.shape: This displays the dimensions of the 
array, as shown below: 


>>>a.shape
(2, 3)


ndarray.size: This displays the total number of elements 
in the array, as shown below: 


>>>a.size 
6 


ndarray.dtype: This displays the data type of elements, 
depending on the type of data stored in it. Built-in types include 
int, bool, float, complex, bytes, str, unicode, buffer; all others are 
referred to as objects. Default dtype of ndarray is float64. 


>>>a.dtype 
dtype('int64')


In the case of strings, the dtype will be displayed as
dtype('S#'), where # represents the length of the longest string in the array:


>>>c=n.array(['sarah','serin','jaden'])
>>>c.dtype
dtype('S5')


ndarray.itemsize: This displays the size of each element 
in bytes. In the example, ‘a’ contains integers and the size of 
the integers is 8 bytes. Hence, the output is as follows: 


>>>a.itemsize 
8 
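
The outputs above appear to come from a Python 2 session. As a hedged sketch of the same attributes under Python 3, where text strings are Unicode and the string dtype is therefore reported with the kind 'U' rather than 'S', consider the following. The integer dtype is requested explicitly here because the default integer size depends on the platform.

```python
import numpy as np  # 'np' is the conventional alias; this article uses 'n'

a = np.array([[1, 2, 3], [3, 4, 5]], dtype=np.int64)
print(a.ndim)      # number of dimensions
print(a.shape)     # length along each dimension
print(a.size)      # total number of elements
print(a.dtype)     # element type
print(a.itemsize)  # bytes per element

# Python 3 strings are Unicode, so NumPy stores them with a 'U' dtype
c = np.array(['sarah', 'serin', 'jaden'])
print(c.dtype.kind, c.dtype.itemsize)
```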


Creating ndarrays 

We have already seen in the above examples how ndarrays are 
created from a Python list using array(). A tuple can also be 
used in place of a list. It is possible to specify explicitly the 
data type of elements in the array, as shown below: 


>>>e=n.array([[1,2,3],[3,4,5]],dtype='S2')
>>>e
array([['1', '2', '3'],
       ['3', '4', '5']],
      dtype='|S2')
>>>e.dtype
dtype('S2')


There are other ways too for generating arrays. A few of 
them are listed below (italicised words in the description of 
each function denote the arguments to the functions). 

ones(shape[,dtype, order]): This returns an array of given
dimensions and type filled with 1s. 'order' in the option set
specifies whether the data is stored in memory in row-major ('C')
or column-major ('F') order. An example is given below.


>>>a=n.ones(3,dtype='S1')
>>>a
array(['1', '1', '1'],
      dtype='|S1')


If the specified dtype is Sn, whatever be the value of ‘n’, 
the array generated by ones() will contain a string of length 1. 
But, at some later point of time, we will be able to replace ‘1’ 
with a string of length up to ‘n’. 

empty(shape[,dtype, order]): This returns an array of given
dimensions and type without initialising the entries, so its initial
contents are arbitrary. In the code segment given below, the
specified dtype is 'S1'. Hence the array 'a' may be modified later
to store strings of length 1.


>>>a=n.empty(3,dtype='S1')
>>>a
array(['', '', ''],
      dtype='|S1')


full(shape, fill_value[,dtype,order]): This returns an array
of given dimensions filled with 'fill_value'. In older NumPy
releases, if dtype is not explicitly specified, a float array is
generated with a warning message; newer releases infer the dtype
from 'fill_value'. A sample statement is given below:


>>>a=n.full((3,3),2,dtype=int)
>>>a
array([[2, 2, 2],
       [2, 2, 2],
       [2, 2, 2]])
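
The examples above use string dtypes. The sketch below, assuming Python 3 and illustrative variable names, shows the same three functions with numeric types, which is the more common use.

```python
import numpy as np  # 'np' is the conventional alias; this article uses 'n'

ones_f = np.ones((2, 3))                 # dtype defaults to float64
ones_i = np.ones((2, 3), dtype=int)      # same shape, integer elements
col_major = np.ones((2, 3), order='F')   # column-major (Fortran) memory layout

scratch = np.empty(3)                    # contents are arbitrary until assigned
scratch[:] = [1.5, 2.5, 3.5]

twos = np.full((3, 3), 2, dtype=int)     # every element set to fill_value
print(ones_f.dtype, ones_i.dtype, col_major.flags['F_CONTIGUOUS'])
print(scratch)
print(twos)
```

Since empty() does not initialise memory, its result should always be filled before it is read.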


fromstring(string[,dtype,count,sep]): This returns a
1-D array initialised with 'string'. NumPy takes 'count'
elements of type 'dtype' from 'string' and generates an array.
'String' will be interpreted as binary data if 'sep', a string, is not
provided, and as ASCII text otherwise. I would like to add a bit
more about fromstring(). This function needs the input string
size to be a multiple of the element size. Unless specified, the
array created using this function will be of dtype 'float64',
which requires 8 bytes for representation. Consider the
example given below:


>>>a=n.fromstring('123')


The above statement intends to generate an array from the
string '123'. But it will generate an error message since its
length is not a multiple of 8.


>>>a=n.fromstring('12345678')


The above statement will successfully generate an array as
given below:


>>>a
array([  6.82132005e-38])


Consider the example given below:


>>>a=n.fromstring('123456',dtype='S2',count=2)


Here, dtype is specified as 'S2'. So the array 'a' will
contain elements of length 2.


>>>a
array(['12', '34'],
      dtype='|S2')


We can see that ‘a’ contains only two elements since the 
count given in fromstring() is 2. 
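
In recent NumPy releases the binary mode of fromstring() is deprecated in favour of frombuffer(), while the text mode (with 'sep' given) remains available. The following is a minimal sketch of both under Python 3, where binary data is passed as a bytes object; the variable names are illustrative.

```python
import numpy as np  # 'np' is the conventional alias; this article uses 'n'

# Binary data: frombuffer() reads raw bytes, here as 2-byte strings
pairs = np.frombuffer(b'123456', dtype='S2', count=2)
print(pairs)

# Text data: with 'sep' given, fromstring() parses numbers from the text
values = np.fromstring('1 2 3', dtype=int, sep=' ')
print(values)

# The buffer length must be a multiple of the element size (8 for float64)
try:
    np.frombuffer(b'123', dtype=np.float64)
    size_error = False
except ValueError:
    size_error = True
print(size_error)
```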

loadtxt(fname[, dtype][,comments] [,skiprows][,delimiter] 
[converters][,usecol].....): This returns an array containing 
elements formed from the data in the file. The contents of an 
input file, say loadtxt.txt, are given below: 


#this is comment line 
abc def ghi 
jkl mno pqr 


Use the function shown below: 


>>>n.loadtxt('/home/abc/Desktop/loadtxt.txt',dtype='S3')
array([['abc', 'def', 'ghi'],
       ['jkl', 'mno', 'pqr']],
      dtype='|S3')


We can see in the output that the comment statement has 
automatically been eliminated. Before applying this function, 


we must make sure that all rows in the file contain an equal 
number of strings. We can specify in the comments option 
of loadtxt() which character will mark the beginning of the 
comments. By default, it is the ‘#’ symbol. The skiprows option 
will help to skip the first ‘skiprows’ lines in the input file. 
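
The 'comments' and 'skiprows' options described above can be tried without creating a file, since loadtxt() also accepts a file-like object. The sketch below assumes Python 3; the extra header line and the in-memory text are illustrative stand-ins for loadtxt.txt.

```python
import io
import numpy as np  # 'np' is the conventional alias; this article uses 'n'

# Stand-in for the contents of loadtxt.txt, with one extra header line
text = """header to be skipped
#this is comment line
abc def ghi
jkl mno pqr
"""

# skiprows drops the first line; lines starting with '#' are treated as comments
data = np.loadtxt(io.StringIO(text), dtype=str, comments='#', skiprows=1)
print(data)
print(data.shape)
```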
arange([start], stop[, step,][,dtype]): This returns an array 
containing elements within a range. 
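
A short sketch of arange() with different argument combinations is given below (Python 3, illustrative variable names). As with Python's built-in range(), the stop value itself is not included.

```python
import numpy as np  # 'np' is the conventional alias; this article uses 'n'

upto = np.arange(5)                  # 0 up to, but not including, 5
stepped = np.arange(2, 10, 3)        # start=2, stop=10, step=3
fractional = np.arange(0, 1, 0.25)   # a fractional step gives a float array
as_float = np.arange(3, dtype=float) # explicit element type

print(upto)
print(stepped)
print(fractional)
print(as_float)
```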
There are many more functions that help in generating 
arrays. They are detailed in the official site scipy.org. 


Functions associated with arrays 
We have written programs to find trace, to sort elements, to 
find the index of non-zero elements, to multiply two matrices, 
etc. We know how lengthy these programs are, if written in 
C. Each of these tasks can be finished with a single statement 
using NumPy. The description of a few functions that are used 
is given below. A majority of the functions associated with an 
array return an array. 

nonzero(a): This returns a tuple containing the indices of 
non-zero elements in the array. 


>>>a=n.array([[0,0,2],[3,0,0],[0,0,0]])
>>>n.nonzero(a)
(array([0, 1]), array([2, 0]))


We can see in the definition of ‘a’ that indices of non- 
zero elements in it are [0,2] and [1,0]. Each element in the 
result of nonzero() is an array containing the index position 
of the non-zero element in that dimension. In this case, 
the first array contains row numbers and the second array 
contains column numbers of non-zero elements. If we had 
a third dimension in the input array, the tuple would have 
contained one more element showing the positions of non- 
zero elements in that dimension. 


>>>a[n.nonzero(a)]
array([2, 3]) 


The above code shows us how to retrieve the non-zero 
elements from the array. 
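
To illustrate the remark about a third dimension, here is a minimal sketch (Python 3, illustrative values) in which nonzero() is applied to a 3D array and returns a tuple of three index arrays.

```python
import numpy as np  # 'np' is the conventional alias; this article uses 'n'

# A 2x2x2 array with two non-zero entries
b = np.zeros((2, 2, 2), dtype=int)
b[0, 1, 0] = 5
b[1, 0, 1] = 7

idx = np.nonzero(b)   # tuple of three index arrays, one per dimension
print(len(idx))
print(idx)
print(b[idx])         # the non-zero values themselves
```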

transpose(a[, axes]): This returns a new ndarray after 
performing a permutation on dimensions. The code segment 
given below shows a 3D array and its transpose. 


>>>a=n.array((((1,2,3),(4,5,6)),((7,8,9),(0,1,2))))
>>>a
array([[[1, 2, 3],
        [4, 5, 6]],

       [[7, 8, 9],
        [0, 1, 2]]])
>>>n.transpose(a)
array([[[1, 7],
        [4, 0]],

       [[2, 8],
        [5, 1]],

       [[3, 9],
        [6, 2]]])
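
The effect of the optional 'axes' argument is easiest to see from the shapes involved. The sketch below (Python 3) reuses the 3D array from the example above; the name 'swapped' is illustrative.

```python
import numpy as np  # 'np' is the conventional alias; this article uses 'n'

a = np.array([[[1, 2, 3], [4, 5, 6]],
              [[7, 8, 9], [0, 1, 2]]])
print(a.shape)                               # (2, 2, 3)

# With no 'axes' argument the order of the dimensions is reversed
print(np.transpose(a).shape)                 # (3, 2, 2)

# 'axes' gives the new order explicitly; here only the last two are swapped
swapped = np.transpose(a, axes=(0, 2, 1))
print(swapped.shape)                         # (2, 3, 2)
print(swapped[0])
```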


sum(a[, axis][, dtype][, out][, keepdims]): This returns the
sum of elements along the given axis. 'out' in the option list is an
ndarray into which the result should be written. keepdims is
a Boolean value which, if set to 'True', keeps the reduced axis
in the result as a dimension of size one. An
example of a 3D array is given below. In this case, the axis
takes values from 0 to 2.


>>>a=n.array((((1,2,3),(4,5,6)),((7,8,9),(0,1,2))))
>>>a
array([[[1, 2, 3],
        [4, 5, 6]],

       [[7, 8, 9],
        [0, 1, 2]]])
>>>n.sum(a, axis=0)
array([[ 8, 10, 12],
       [ 4,  6,  8]])
>>>n.sum(a, axis=1)
array([[ 5,  7,  9],
       [ 7,  9, 11]])
>>>n.sum(a, axis=2, keepdims=True)
array([[[ 6],
        [15]],

       [[24],
        [ 3]]])

prod(a [, axis][,dtype][,out][,keepdims]): This returns the 
product of elements along the given axis. 
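
prod() takes the same options as sum(). A minimal sketch on a small 2D array is given below (Python 3, illustrative variable names).

```python
import numpy as np  # 'np' is the conventional alias; this article uses 'n'

a = np.array([[1, 2, 3],
              [4, 5, 6]])

total = np.prod(a)                          # product of all elements
by_col = np.prod(a, axis=0)                 # multiply down each column
by_row = np.prod(a, axis=1, keepdims=True)  # keep the reduced axis with size 1
print(total)
print(by_col)
print(by_row.shape)
```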

There are functions like argmax, min, argmin, ptp, clip, 
conj, round, trace, cumsum, mean, var, std, cumprod, all 
and any, which make scientific computations easier. There 
are functions for array conversion, shape manipulation, item 
selection and manipulation too. If one wishes to dig deep, 
please visit the official SciPy site. 
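
A few of the functions named above are sketched below on a small 3x3 array (Python 3). The array values are illustrative; each call is a single statement, as the text describes.

```python
import numpy as np  # 'np' is the conventional alias; this article uses 'n'

m = np.array([[4, 9, 2],
              [3, 5, 7],
              [8, 1, 6]])

print(np.trace(m))        # sum of the main diagonal: 4 + 5 + 6
print(np.argmax(m))       # position of the largest value in the flattened array
print(np.ptp(m))          # range of values: maximum minus minimum
print(np.cumsum(m[0]))    # running total of the first row
print(np.clip(m, 3, 7))   # values limited to the interval [3, 7]
print(np.mean(m), np.all(m > 0), np.any(m > 8))
```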


Operations on arrays 

Arithmetic operations: Arithmetic operators like '+', '-', '*',
'/' and '%' can be applied directly on NumPy arrays. It is to
be noted that all operations are element-wise operations. The
result of 2D array multiplication is shown below:


>>>c=n.array([[1,2],[3,4]])
>>>d=n.array([[1,3],[2,1]])
>>>c*d
array([[1, 6],
       [6, 4]])


If you wish to perform matrix multiplication, use the 
function dot() as shown below or generate matrices using the 
matrix function and use the ‘*’ operator on them. 


>>>c=n.array([[1,2],[3,4]])
>>>d=n.array([[1,3],[2,1]])
>>>c
array([[1, 2],
       [3, 4]])
>>>d
array([[1, 3],
       [2, 1]])
>>>n.dot(c,d)
array([[ 5,  5],
       [11, 13]])
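
On Python 3.5 and later, the '@' operator offers another way to write a matrix product on ordinary arrays. The sketch below contrasts it with element-wise '*', using the same 'c' and 'd' as above; the other variable names are illustrative.

```python
import numpy as np  # 'np' is the conventional alias; this article uses 'n'

c = np.array([[1, 2], [3, 4]])
d = np.array([[1, 3], [2, 1]])

elementwise = c * d      # multiplies matching positions
matrix_prod = c @ d      # rows of c times columns of d
print(elementwise)
print(matrix_prod)
print(np.array_equal(matrix_prod, np.dot(c, d)))
```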


Relational operations: NumPy allows one to compare two
arrays using relational operators. The result will be a Boolean
array, i.e., an element in the resultant array is set to 'True' only if
the condition is satisfied. An example is shown below:


>>>c=n.array([[1,2],[3,4]])
>>>d=n.array([[1,3],[2,1]])
>>>c
array([[1, 2],
       [3, 4]])
>>>d
array([[1, 3],
       [2, 1]])
>>>c==d
array([[ True, False],
       [False, False]], dtype=bool)


Logical operations: Logical operations can be performed 
on arrays using built-in functions supported by NumPy. 
Functions like logical_or(), logical_not(), logical_and(), etc 
can be used for this purpose. The code segment given below 
shows the results of the XOR operation. 


>>>c=n.array([[0,1],[2,3]])
>>>d=n.array([[1,3],[2,0]])
>>>n.logical_xor(c,d)
array([[ True, False],
       [False,  True]], dtype=bool)
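
The other logical functions follow the same pattern. Below is a minimal sketch (Python 3) using the same 'c' and 'd'; the result names are illustrative.

```python
import numpy as np  # 'np' is the conventional alias; this article uses 'n'

c = np.array([[0, 1], [2, 3]])
d = np.array([[1, 3], [2, 0]])

# Any non-zero value counts as True, zero as False
both = np.logical_and(c, d)
either = np.logical_or(c, d)
negated = np.logical_not(c)
print(both)
print(either)
print(negated)
```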

Indexing and slicing arrays 
The NumPy array index starts at 0. Let 'a' be a 2D array.
'a[i][j]' represents the (j+1)th element in the (i+1)th row.
Equivalently, you can write it as 'a[i,j]'. 'a[3,:]' represents
all elements in the 4th row. 'a[i:i+2, :]' represents all the
elements in the (i+1)th and (i+2)th rows, since the stop index of a slice is excluded.
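
The indexing forms described above can be tried on a small array. The sketch below assumes Python 3; the 5x4 array is an illustrative choice.

```python
import numpy as np  # 'np' is the conventional alias; this article uses 'n'

a = np.arange(20).reshape(5, 4)   # 5 rows and 4 columns holding 0..19

print(a[1][2], a[1, 2])   # both forms select row index 1, column index 2
print(a[3, :])            # every element of the 4th row
print(a[1:3, :])          # rows with index 1 and 2; the stop index is excluded
print(a[:, 0])            # first column
```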

I am now going to explain an attractive feature of NumPy
arrays, which is support for Boolean indexing.
The example given below explains the same.


>>>c=n.array([1,4,7,8,2])
>>>d=c<5
>>>d
array([ True,  True, False, False,  True], dtype=bool)
>>>c[d]
array([1, 4, 2])


Here, 'd' is a Boolean array whose element is set to 'True'
if the corresponding element in 'c' has a value less than 5.
Accessing array 'c' using 'd', i.e., c[d], will fetch an element
in 'c' only if the element in the corresponding index position
in 'd' is 'True'. I will give one more example. An array 'a' is
defined as follows:


>>>a=n.array([1,2,3,4,5,6,7]) 


We can see that the statement given below will retrieve all
elements in array 'a' which are even numbers.


>>>a[a%2==0]
array([2, 4, 6])


Integer overflow in Python

Python 2 supports two types of integers: int and long. Int is a C type, which allows a fixed range of values to be taken, while long is arbitrary precision whose maximum value is limited by the available memory. If int is not enough, it will be automatically promoted to long. When it comes to Python 3, there is support for arbitrary precision integers. So there is no question of overflow in integer operations in pure Python. But we cannot restrict our use to pure Python, since scientific computation needs packages in the PyData stack (e.g., NumPy, Pandas, SciPy, etc). The PyData stack uses C type integers which have fixed precision. It uses 64 bits for representation. So the maximum value an integer can take is 2^63 - 1. The overflow condition is shown below:


>>>a=n.array([2**63-1,4],dtype=int)
>>>a
array([9223372036854775807, 4])
>>>a+1
array([-9223372036854775808, 5])


To conclude, NumPy not only makes computation easier, but also makes the program run faster. It provides multi-dimensional arrays and tools to play with arrays.


References

[1] wikipedia.org
[2] scipy.org
[3] www.quora.com/What-is-the-relationship-among-NumPy-SciPy-Pandas-and-Scikit-learn-and-when-should-I-use-each-one-of-them

It is one of the most popular Python based CMS frameworks, so people who are familiar with Python can try it out. There are many websites built using this CMS.

The official website for it is 
https://www.django-cms.org/en/. 


10. Liferay 
Liferay is a free and open source 
enterprise software product that is 
mainly used for corporate intranets and 
extranets. It has inbuilt CMS support 
and is written in Java. It has an open 
source and enterprise version. Liferay 
Portal CE is the open source version of 
Liferay’s enterprise Web platform. 

The official website for it is 
https://www.liferay.com/.


Figure 10: Liferay 


Top FOSS Tools that Assist Scholarly Research


Free and open source software has a lot to offer research scholars and higher 
education students in carrying out their academic activities in an efficient 
manner. These tools assist in surveying the literature, managing references, 
drafting manuscripts and theses, making presentations, building graphical 
illustrations and in collaborating with peers. 


Scholarly research is a dynamic and complex activity. It incorporates various components that require high levels of cognitive skills. Similar to other aspects of life, the research domain has also benefited a lot from IT tools. Anyone associated with academic research is well aware that usually, only a limited budget is allotted to IT tools. In such a scenario, getting commercial and proprietary IT tools is difficult from both the financial and procedural perspectives. Free and Open Source Software (FOSS) for performing research tasks has been a big boon for academic researchers. Let me make it clear that the FOSS tools featured in this article have not been selected only because of near-zero costs, but also for their professional features which are on par with their commercial counterparts, if not better.

This article attempts to highlight FOSS tools associated with various aspects of academic research, which are listed below:
* Searching the literature and reference management
* Drafting a manuscript
* Building graphical illustrations
* Networking and collaboration
* Analysis tools

For each one of the above mentioned requirements, there 
are umpteen FOSS tools. Exploring each of these tools is not 
feasible due to space limitations. Hence, this article introduces 
one or two leading tools for each requirement. 


Searching the literature and reference 
management 

One important task that's repeatedly carried out by scholars is collecting information from existing literature. 'Search, search again and re-search' constitutes the major chunk of the work in academic research. By the term 'searching' I don't only refer to the simple task of searching in a search engine as carried out by regular users. Here, 'search' implies the task of collecting information from various sources. Of course, the first step in this process is searching in specialised search engines such as Google Scholar and Microsoft Academic. The crucial component is how to collect and organise the results returned by these search engines in response to the user query. To carry out this step, various tools are available, from which Zotero has been selected.

Figure 1: Various aspects of scholarly academic research

Zotero is powerful software for collecting and managing references for your research. It can be installed in two different modes:
* Standalone
* Firefox extension

Zotero provides five major features to users:
* Collect
* Organise
* Cite
* Sync
* Collaborate

The Zotero workflow is as follows.
* You navigate to the results page of an academic search engine or any page that contains your research content.
* Zotero comes loaded with various components that sense the presence of research content on a page. When such content is detected, an icon with the Zotero logo is automatically shown in the browser. By simply clicking on this icon, the metadata associated with that content is added to your Zotero repository.
* While you are writing a paper, you can directly insert this reference by searching for it through the Zotero word processor connector plugins. These plugins can be installed either manually or automatically.

The biggest advantage of Zotero is that changing the citation style can be done with just a single click. For scholarly researchers, changing the layout and citation style of a manuscript from one format to another is a laborious task. For example, if you prepare a manuscript with the IEEE citation style and now wish to convert it to the Springer style, you can do so by simply changing the citation style settings in the Zotero word processor plugin. The Zotero style library has more than 8100 different styles. Zotero allows the creation of custom styles as well.

Zotero provides the cloud syncing capability, which allows you to sync your collected research data across various systems. There is a detailed article about Zotero available at http://opensourceforu.com/2015/01/speed-up-your-research-paper-with-zotero/

Figure 2: Major features of Zotero


Drafting a manuscript 

Drafting a manuscript is another big task. Though any word processor can be used to draft manuscripts, there are some inherent limitations in this approach. As stated earlier, converting a manuscript from the style template of one journal to another is a task that cannot be automated if you simply go with the word processor approach.

To rescue scholars from the burden of such laborious tasks, we have LaTeX (usually pronounced as Lay-Tech). It is a typesetting system for creating production quality output. While word processing tools follow the 'What You See Is What You Get' (WYSIWYG) approach, LaTeX uses an approach similar to markup languages such as HTML. Here, a clear demarcation is made between the content and its style. The authors can concentrate more on the content. Production of professional quality output, as per the style selected by the author, is carried out by the LaTeX software. It is obvious that getting familiar with LaTeX requires some initial extra work by the authors. However, it is worth going through this learning curve, as the benefits are huge.

Shown below is a simple LaTeX example:


\documentclass{article} 
\title{Simple LaTex Article} 
\author{K S Kuppusamy} 
\date{01 Jan 2018} 
\begin{document} 

\maketitle 

Hello world! 

\end{document} 


The meaning of this LaTeX code is given below:
* This LaTeX document belongs to the class called 'article'. There are other classes such as 'report'.
* The title for this article is 'Simple LaTex Article'.
* The author's name is 'K S Kuppusamy'.
* The date of this article is '01 Jan 2018'.
* The main body content is 'Hello world!'.

Figure 3: LaTeX's features


It is easy to understand the source of the LaTeX article. The major benefits of using LaTeX are listed below.
* It can be used to create journal articles, reports, books, theses, etc. LaTeX is very good at handling larger sized content. With the traditional word processing tools, managing a 300-page document with many tables and figures can be tricky. Even a simple change in one place will have difficult-to-track cascading effects in various other places. In the case of LaTeX, the professional rendering of content with proper alignment is the task of the LaTeX engine and authors need not worry much.
* Managing linked references is very simple. For references, you can explore BibTeX.
* If your document consists of mathematical formulae, then LaTeX should be your first choice. The rendering of formulae and other scientific notations/symbols is handled excellently by LaTeX.
* LaTeX offers support for managing many languages. For multi-lingual documents, LaTeX can be very handy.
* Incorporation of pictures/images and their alignment issues are handled professionally by LaTeX.

LaTeX supports all major platforms. The LaTeX installation instructions are available at https://www.latex-project.org/get/. If you don't want to install it in your system, there are online tools available. Some of them are listed below:
* Papeeria
* Overleaf
* LaTexBase

Online tools such as Overleaf make collaborative LaTeX writing simple and effective. If you work as a team, then it is definitely worth giving them a try. The Beamer component of LaTeX helps you to create presentations. As with documents, presentations created using Beamer are consistent in style and formatting across all the slides.


Building graphical illustrations 

Making good quality, meaningful graphical illustrations is another important task for scholars. There are many diagramming tools available for this purpose. This article introduces you to a tool named yEd (https://www.yworks.com/products/yed), which is freely available on all major operating systems such as Windows, GNU/Linux, Mac, etc. yEd allows you to build various types of diagrams such as:
* Flowcharts
* Family trees
* Semantic networks
* Social networks

A gallery of diagrams supported by yEd is available at https://www.yworks.com/products/yed/gallery.

yEd can be either installed in your system or a live online version can be used.

Automatic Layout is an important feature provided by yEd. The algorithms in yEd can be used to lay out the diagrams in various layouts by simply clicking a button. Some of the popular layouts are listed below:
* Hierarchical
* Orthogonal
* Organic
* Circular
* Tree

Figure 4: yEd layouts


Networking and collaboration tools 

Networking is important for a good scholarly researcher. 
Information technology tools make it easy to communicate 
and collaborate with peers. In addition to the normal social 
networking tools used by everyone, there are some specific 
options for researchers to communicate and collaborate. A 
few of them are listed below. 

Academia (https://www.academia.edu): Academia
is a free and simple way to share your research. One of
the advantages of using this collaboration tool is that by
reaching the right peers, you improve the chances of
your work being cited.

ResearchGate (https://www.researchgate.net):
ResearchGate is a simple and effective platform to share
your research papers and details about the projects that you
are working on. It enables you to gather insights into your
peers’ work at the global level, and get hints on the progress
of their current research projects.


Analysis tools
In addition to the general-purpose tools explained above,
there are some scientific computing and analysis tools.
Enumerating the complete list of all such tools is out of the
scope of this article; however, Table 1 features some of them.

Table 1

Tool | Description
GNU PSPP (www.gnu.org/software/pspp) | This is a very popular tool for statistical analysis and is available completely free. The corresponding proprietary counterparts are costlier and have a restriction on the number of licences.
Scilab (http://scilab.io) | This is a FOSS tool associated with scientific computing problems.
Weka (https://www.cs.waikato.ac.nz/ml/weka/) | This provides machine learning algorithms for data mining tasks. Its advantage is that it can be harnessed by researchers from non-computer science domains also, as it doesn’t require any coding skills.

As stated at the beginning of the article, it would be
impossible to cover all the tools that enhance scholarly
academic research. This article has just thrown some light
on the various dimensions of academic research and the
corresponding FOSS tools that can be used to improve
research productivity.

The author is an assistant professor of computer science
at the School of Engineering and Technology, Pondicherry
Central University. He has 13+ years of teaching and
research experience in academia and industry. He has
received the ‘Best Teacher Award’ five times. He can be
reached at kskuppu@gmail.com.


Free and Open Source Tools for Bioinformatics and Molecular Biology


The entry of open source tools in the life sciences arena has proven to be a boon.
Open source tools can be used in the predictive and diagnostic fields to provide
better medical treatment. Through their use in brain mapping and DNA studies, open
source tools can even be used to combat crime.

Nowadays, the applications of information and
communications technology (ICT) are not limited to
data transmission, cloud deployments, social media,
Web servers and mobile applications. Over the last decade,
IT has touched every area of the social and corporate world,
including health and medical sciences. Most medical
diagnosis laboratories are now equipped with advanced
computerised machines to accurately diagnose and fetch the
parameters of the human body. These diagnostic machines
include those used for magnetic resonance imaging (MRI),
computed tomography (CT), electroencephalography (EEG),
etc. These systems provide a higher degree of accuracy in the
analysis of the human body, assisting doctors in diagnosing the
disease and thus recommending a suitable course of treatment.

In addition to diagnostic machines, software tools and
libraries are also used. These software tools and applications
evaluate the biological data collected from the computerised
diagnostic machines. Thus, the concept of bioinformatics
has evolved, which uses software tools and applications to
understand the biological and medical data. These software
suites make use of high performance programming languages
at the back-end to process and evaluate the biological data set,
leading to effective treatment.

Bioinformatics is the interdisciplinary area that integrates 
biology, computer science, mathematics, engineering, 
chemistry and statistics for advanced predictions and analytics. 
The field of molecular biology is also closely associated with 
bioinformatics for accurate analysis of biological structures. 
Molecular biology deals with the deep analysis of the
biomolecular movements in the cells of the body along with the
details of proteins, DNA, RNA and biosynthesis.

Data sets for research in medicine and biology
With the deployment of computerised machines, researchers
in diagnostic and medical sciences are taking assistance from
software professionals, who develop the programming
modules needed in their field. Computer scientists, too, are
now taking up the interdisciplinary field of bioinformatics
for their research so that their programming knowledge can
be utilised for the health sciences.

There are numerous medical data sets available for 
research, which are released by the diagnostic laboratories 
so that the overall architecture and structure of medico- 
biological data can be analysed by software experts. The 
programmers working in bioinformatics can download these 
medical data sets and they can perform the analysis using 
effective algorithms. 

The software tools that can be used for the analysis and 
evaluation of medical data for specific types of data sets are 
summarised below. 

OpenEEG (http://openeeg.sourceforge.net/doc/)

OpenEEG is free and open source software that can be 
used for EEG signal analysis with numerous libraries as add- 
ons, including Neuroserver, BioEra, BrainBay, Brainathlon, 
BrainWave Viewer and EEGMIR. 

EEGNET (https://sites.google.com/site/eegnetworks/) 
This is a free and open source tool for the analysis and
visualisation of EEG brain signals. It has features to visualise
the brain network.

BioSig (http://biosig.sourceforge.net/) 

BioSig is a software library under free and open source
distribution with many features of biomedical signal
processing. This library has excellent features to process
biosignals including electrocorticogram (ECoG),
electromyogram (EMG), electrocardiogram (ECG),
electrooculogram (EOG), electroencephalogram (EEG),
respiration and many others. In addition, the interfacing
toolboxes and drivers for Octave, MATLAB, Python, PHP,
Perl, Ruby, Tcl, C and C++ are also available. The key areas
of brain-computer interfaces, psychology, neuroinformatics,
cardiovascular systems, neurophysiology and sleep research
are effectively processed in BioSig.

GenomeTools (http://genometools.org/) 

GenomeTools is open source software for the analysis 
of genome and biological parameters. It has a free library of 
tools for bioinformatics. The APIs in C are available with 
detailed manuals. In addition, the deep analysis of biological 
structures is integrated in GenomeTools. 


Figure 1: Elements of bioinformatics 


Working with BioPython for molecular 
biology (http://biopython.org/) 
BioPython provides the set of tools and libraries for the 
analysis and computation of biological structures. It is 
available for free and open source distribution and is 
promoted by the Open Bioinformatics Foundation (OBF). It 
can parse the files of bioinformatics into the data structures 
that can be processed by Python code. 

The following international formats are supported in
BioPython:
• UniGene
• PubMed
• GenBank
• Medline
• FASTA
• ClustalW
• BLAST
Use the following command to install
BioPython in Ubuntu:


$ sudo apt-get install python-biopython


The following command will install the documentation
along with BioPython:


$ sudo apt-get install python-biopython-doc


BioSQL (http://biosql.org) can be used with BioPython
to store a biological database. To integrate BioSQL, the
following instruction is executed:


$ sudo apt-get install python-biopython-sql


Figure 2: BioPython library for molecular biology (download page, release 1.70, 10 July 2017)


The sequence is the key object in bioinformatics. Sequences can
be processed in BioPython with the following instructions:


>>> from Bio.Seq import Seq
>>> my_seq = Seq("MyDefinedSequence")
>>> my_seq
Seq('MyDefinedSequence', Alphabet())
>>> print(my_seq)
MyDefinedSequence
>>> my_seq.alphabet
Alphabet()


Complement and reverse complement 
These are very simple: the methods return a new Seq object with
the appropriate sequence and the same alphabet, as shown below:


>>> from Bio.Seq import Seq
>>> from Bio.Alphabet import generic_dna
>>> my_values_dna = Seq("TAGTACACTGGT", generic_dna)
>>> my_values_dna
Seq('TAGTACACTGGT', DNAAlphabet())
>>> my_values_dna.complement()
Seq('ATCATGTGACCA', DNAAlphabet())
>>> my_values_dna.reverse_complement()
Seq('ACCAGTGTACTA', DNAAlphabet())


Transcription functions on DNA and RNA 

If you have a DNA sequence, you may want to turn it into RNA.
In bioinformatics we normally assume the DNA is the coding
strand (not the template strand); so this is a simple matter of
replacing all the thymines with uracil:


>>> my_values_dna = Seq("AGTACACTGGT", generic_dna)
>>> my_values_dna
Seq('AGTACACTGGT', DNAAlphabet())
>>> my_values_dna.transcribe()
Seq('AGUACACUGGU', RNAAlphabet())


With the specification of RNA, the associated
DNA can be fetched:


>>> my_values_rna = my_values_dna.transcribe()
>>> my_values_rna
Seq('AGUACACUGGU', RNAAlphabet())
>>> my_values_rna.back_transcribe()
Seq('AGTACACTGGT', DNAAlphabet())
>>> my_values_rna.back_transcribe().reverse_complement()
Seq('ACCAGTGTACT', DNAAlphabet())
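To see what these methods compute, the same operations can be sketched in plain Python without BioPython. This is only an illustration under simple assumptions (upper-case A, C, G and T/U only, no ambiguity codes); the helper names below are hypothetical and are not part of the BioPython API.

```python
# Plain-Python sketch of the sequence operations discussed above.
# Assumption: sequences contain only upper-case A, C, G and T (or U for RNA).
# These helper names are illustrative; they are not BioPython functions.

DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def complement(dna: str) -> str:
    """Swap each base for its Watson-Crick partner (A<->T, C<->G)."""
    return dna.translate(DNA_COMPLEMENT)


def reverse_complement(dna: str) -> str:
    """Complement the strand, then read it in the opposite direction."""
    return complement(dna)[::-1]


def transcribe(coding_dna: str) -> str:
    """Coding-strand transcription: replace every thymine with uracil."""
    return coding_dna.replace("T", "U")


def back_transcribe(rna: str) -> str:
    """Recover the coding DNA strand by replacing uracil with thymine."""
    return rna.replace("U", "T")


if __name__ == "__main__":
    print(complement("TAGTACACTGGT"))
    print(reverse_complement("TAGTACACTGGT"))
    print(transcribe("AGTACACTGGT"))
    print(back_transcribe("AGUACACUGGU"))
```

In real work the Seq methods are preferable, since they also handle ambiguity codes and mixed-case input.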


Sleep EEG analysis in GNU Octave 

Assorted signals are delivered to all parts of the body so
that the organs can communicate with each other
for specific or general purposes. Electroencephalography
(EEG) records the electrical activity that the brain
generates, even when a person is asleep or
unconscious. EEG signals comprise brain waves that
can be evaluated using GNU Octave. Sleep disorders and
various diseases can be analysed through EEG evaluation.

GNU Octave (https://www.gnu.org/software/octave/)
is a powerful, multi-functional tool used for engineering
and scientific research applications. Simulations related to
engineering as well as medicine can be implemented with
the assorted toolboxes and functions in Octave. It is used as
an effective alternative to MATLAB since it is open source
and can be freely distributed. A number of toolboxes for
different applications are available in GNU Octave, which
can be used for optimisation and predictive analysis.

The Waveform Database (WFDB) package can be
integrated with GNU Octave. This package is equipped
with the functions and modules for EEG and brain
signal evaluations. A similar process is followed in
brain mapping or brain fingerprinting for criminal
investigation when the subject is in an unconscious
state. There are assorted stages of sleep or unconscious
states which can be analysed from EEG signals recorded
from the electrodes. This process assists in the
forensic analysis of the person while in the unconscious
state. Medical disorders can also be detected through this
evaluation using the WFDB package in Octave. The
following are excerpts of the benchmark sleep stages
which can be evaluated using the WFDB package in
GNU Octave so that the overall state of the nervous
system can be assessed, predictions made and brain
disorders diagnosed.

Stage 1: Tiredness, drowsiness, the pre-sleep stage and lethargy
• Eye activities
• Rolling eye movements
• Sharp transients
Stage 2: Normal night sleep
• Sleep spindles
• Slow eye movement
Stage 3: Delta sleep or slow wave sleep
• Sleep time of 6.5 hours


Figure 3: Viewing EEG signals in the WFDB tool box


With the following instruction, the demonstration of the
WFDB tool box can be viewed in Octave.


>> wfdbdemo 


The following instructions can be executed to read and plot 
the ECG signal from the data set repository of PhysioBank: 


[time, signal] = rdsamp('mitdb/100', 1);
plot(time, signal);


(Source: https://www.physionet.org/physiotools/matlab/wfdb-app-matlab/)

Using similar methodology, the waveform of arterial blood 
pressure (ABP) can be analysed using the wabp function. 


Scope of research in biomedical engineering 
Nowadays, bioinformatics and biomedical predictive analytics
are two key domains of research for assorted applications.
Signals generated by the brain, heart and other parts of the
human body are extracted, processed and mined for
predictions with the use of information technology. The data
sets from PhysioNet, UCSD, FPMS and others can be used
for research work in bioinformatics with the integration of
data mining and machine learning tools.


Top Ten Open Source Tools for Mathematicians


It’s that time of the year when lists of all kinds are being made. So why not
a list of open source tools to be used by mathematicians? The authors of
this article have sifted through a number of tools and have come up with a
selection that will be of immense use to mathematicians.


Over the past many years, we have gone through a
large number of ‘Top Ten’ lists on various topics,
both technical and non-technical. Some of them are
quite interesting and others boring to the core. For example,
‘The top ten Linux distributions of all time’ is definitely an
interesting read, whereas ‘The top ten things to do with a
glass of water’ is downright useless, if not boring. Yes, the
Internet is full of interesting, boring and stupid Top Ten lists,
but we never came across a list enumerating open source
tools used by mathematicians, both professionals and
academicians. So, we are listing the top ten open source
tools that are much used by mathematicians.


bc

The utility called bc (basic calculator) is an
arbitrary-precision calculator language. This utility is often
called the bench calculator and can be invoked with the
command bc on a terminal. The two important versions of bc
currently used are POSIX bc and GNU bc. GNU bc is available as
free and open source software licensed under the GNU
General Public License. The decision to include a tool like
bc, whose development started in 1975, in such a list might
be a bit controversial. Of course, there are far more powerful
tools available to choose from. Yet, we chose it because,
first, it is an arbitrary-precision calculator limited only by
the available memory of the host system and not limited to
8, 16, 32 or 64 bits of precision. This really is an important
thing and if you have any doubts, ask those programmers
working in the field of scientific computing, who are
plagued by the proper representation of floating-point
numbers and how difficult it is to tame them.

Another reason is that though it looks very simple, bc
is a Turing complete language, making it as powerful as
programming languages like C and Java, at least in theory.
The final reason for choosing bc for this list is that you don’t
need a gun to kill a fly. On many occasions, it is a case of
overkill to use powerful tools like Scilab or SageMath for
simple numeric computations. Figure 1 shows the output of a
division operation using bc with various levels of precision.
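The idea of division at a chosen level of precision can be sketched in Python, whose integers are also arbitrary-precision. This is only an illustration of the concept, not a reimplementation of bc: the function name divide_with_scale is hypothetical, the scale argument is meant to play the role of the number of digits kept after the decimal point, and formatting details may differ from bc's own output.

```python
def divide_with_scale(numerator: int, denominator: int, scale: int) -> str:
    """Divide two non-negative integers, keeping `scale` digits after the point.

    Python integers are arbitrary-precision, so the only limit on `scale`
    is available memory. Extra digits are truncated, not rounded.
    """
    if numerator < 0 or denominator <= 0 or scale < 0:
        raise ValueError("expected numerator >= 0, denominator > 0, scale >= 0")
    # Shift the numerator left by `scale` decimal places, then divide exactly.
    scaled = (numerator * 10**scale) // denominator
    whole, fraction = divmod(scaled, 10**scale)
    if scale == 0:
        return str(whole)
    # Zero-pad the fractional part so that leading zeros are not lost.
    return f"{whole}.{fraction:0{scale}d}"


if __name__ == "__main__":
    for scale in (5, 20, 50):
        print(scale, divide_with_scale(22, 7, scale))
```

Because the arithmetic is done on integers, no floating-point rounding error creeps in, however many digits are requested.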


Scilab 

Scilab is an open source, cross-platform numerical
computational tool and a numerical programming language
licensed under the GNU General Public License. It
can be used for signal processing, statistical analysis,
image processing, fluid dynamics simulations, numerical
optimisation, etc, often with the help of toolboxes like Image
Processing Toolbox, Wavelet Toolbox, etc. Scilab is an open
source alternative to MATLAB, another very popular but
proprietary numerical computing environment. The two
tools resemble each other to some extent, so that often a
person skilled in one will find it easy to migrate to the other.
A comparison of Scilab and MATLAB is clearly out of the
scope of this article, but there is at least one area where
Scilab clearly outperforms MATLAB, and that is, price! You
absolutely don’t have to pay anything to use Scilab, whereas
MATLAB is a relatively costly scientific tool. Scilab can
be used as a programming environment and an interactive
mathematical shell. Figure 2 shows how Scilab works as an
interactive mathematical shell.


Figure 1: bc in action


GNU Octave 


GNU Octave is another important mathematical tool primarily
used for numerical computations. It is part of the GNU Project
and is offered as free and open source software licensed under
the GNU General Public License. GNU Octave is one of the
two very powerful open source alternatives to MATLAB,
the other being Scilab. But GNU Octave gives far more
importance to syntactic compatibility with MATLAB than
Scilab does. To prove our point, we have copied and executed
the following MATLAB code from the Wikipedia article on
MATLAB, on both Scilab and GNU Octave.


[X,Y] = meshgrid(-10:0.25:10, -10:0.25:10);
f = sinc(sqrt((X/pi).^2+(Y/pi).^2));
surf(X,Y,f);
axis([-10 10 -10 10 -0.3 1])
xlabel('{\bfx}')
ylabel('{\bfy}')
zlabel('{\bfsinc} ({\bfR})')


Scilab failed to produce an output whereas GNU Octave 
successfully executed the program. The output obtained 
from GNU Octave is the same as the output obtained from 
MATLAB. The output is shown in Figure 3, and you can 
compare it with the figure given in the Wikipedia article on 
MATLAB to fully comprehend the syntactic similarity of 
GNU Octave and MATLAB. GNU Octave can also be used as 
a programming environment and an interactive shell. 
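For readers without MATLAB or Octave at hand, the values behind that plot can be sketched in plain Python. The sketch assumes MATLAB's normalised definition sinc(t) = sin(pi*t)/(pi*t), under which the plotted expression reduces to sin(R)/R with R = sqrt(x^2 + y^2). The function names are hypothetical, and the code only computes the heights; it does not draw the surface.

```python
import math


def sinc_surface(x: float, y: float) -> float:
    """Height of the plotted surface at (x, y).

    With the normalised sinc(t) = sin(pi*t)/(pi*t), sinc(R/pi) reduces to
    sin(R)/R with R = sqrt(x^2 + y^2); the value at R = 0 is defined as 1.
    """
    r = math.hypot(x, y)
    return 1.0 if r == 0.0 else math.sin(r) / r


def grid(start: float, step: float, stop: float) -> list:
    """Evenly spaced points, like the MATLAB range start:step:stop."""
    count = int(round((stop - start) / step)) + 1
    return [start + i * step for i in range(count)]


if __name__ == "__main__":
    axis = grid(-10.0, 0.25, 10.0)
    heights = [sinc_surface(x, y) for x in axis for y in axis]
    # The extremes show why a z-axis range of -0.3 to 1 fits the surface.
    print(len(axis), min(heights), max(heights))
```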


Maxima 

Maxima is a computer algebra system that is widely used in
the fields of mathematics like algebra, calculus, etc. Maxima
is developed using Lisp, and it is a cross-platform tool that
runs on UNIX, Linux, Windows, Android, MacOS, etc. It
is free and open source software licensed under the GNU
General Public License. Due to the complexity involved
in developing computer algebra systems, there are only
a few powerful tools like Maxima and SageMath. A very
popular graphical user interface (GUI) for Maxima is called
wxMaxima, and it can be used as a programming environment
and an interactive shell. Figure 4 shows how Maxima can be
used to perform integration. We have used a simple function
to perform integration, but you can also find the integral of
complex functions using Maxima.


Scilab 6.0.0-beta-2 Console

--> 1/3
 ans  =

   0.3333333

Figure 2: A simple calculation in Scilab

Figure 3: Output from GNU Octave

Figure 4: Integration using Maxima


SageMath 


SageMath is a computer algebra system that has widespread
use in various fields of mathematics like algebra,
combinatorics, graph theory, numerical analysis, number
theory, calculus, statistics, etc. Computer algebra systems
are mostly used for symbolic computing, rather than numeric
computing and number crunching. The large number of
packages necessary for a powerful computer algebra system
makes them very rare and due to this reason, SageMath is a
very important tool. It is a free and open source tool licensed
under the GNU General Public License, and is a competitor to
many other mathematical tools like Maple, Mathematica,
MATLAB, etc, which are proprietary software. SageMath is not
only competent enough to replace any of these tools but also
absolutely free. It can be used in two modes, as a programming
environment and an interactive shell. The browser-based
interactive shell called Notebook has the ability to remember
previous inputs and outputs for review and reuse. SageMath is
a relatively new tool with most of the development being done
using Python; the initial release came out in 2005. Figure 5
shows the logo of SageMath.


Figure 5: The logo of SageMath


R 


R is a mathematical tool mainly used for statistical
computing. It is used by mathematicians in general and by
statisticians in particular for data mining and developing
statistical software. The development of R is supervised by
the R Foundation for Statistical Computing. R is a
cross-platform tool that can be used in various operating
systems like Linux, Microsoft Windows, etc. R is free and open
source software licensed under the GNU General Public License.
It is often used as an open source alternative to proprietary
statistical packages like SAS, SPSS, etc. But R being freely
available is not the main reason for its widespread adoption
by the statistical community; rather, its popularity is based
on its power. R can be used as a command line utility as well
as with a graphical user interface (GUI). A very popular GUI
for R is RStudio. R is also capable of producing a variety of
graphical output. Figure 6 shows the output of the
simple R commands 'var1 <- c(1, 3, 5, 4, 8)' and 'plot(var1,
type="o", col="red")' when executed on the console.


Figure 6: Plotting with R

Figure 7: Logo of Isabelle


In [1]: from scipy.special import factorial

In [2]: factorial(100, exact=True)
Out[2]: 93326215443944152681699238856266700490715968264381621468592963895217599993229915608941463976156518286253697920827223758251185210916864000000000000000000000000L


Figure 8: Factorial using SciPy


Isabelle 

While choosing ten useful mathematical tools for this article,
we made sure that what we selected is used in diverse fields of
mathematics, rather than choosing ten popular software from
a single field, like numerical computations, for instance. So,
this tool may not be as popular as many other mathematical
tools left out from this list. Let us clarify further: the
general public is familiar with numerical computing tools like
MATLAB, Scilab, etc, but may not be aware of SageMath
or Maxima because the latter two are mostly used in the
more abstract areas of mathematics like combinatorics,
graph theory, number theory, etc. Mathematics itself is often
considered an abstract subject, sometimes even by famous
mathematicians. G.H. Hardy once said, “We have concluded
that the trivial mathematics is, on the whole, useful, and that
the real mathematics, on the whole, is not.” There are certain
fields in mathematics which are deemed abstract even by the
standards of professional mathematicians; one such field is
logic and automatic theorem-proving. Not many people are
interested in automatic theorem-proving, but for the select
few who work in this area, a tool to aid their quest will be
like a gift from heaven. So, with that rationale, we introduce
Isabelle, an interactive theorem-proving software. It is free,
open source and available for use under the BSD licence.
Isabelle has been developed by using two programming
languages, ML and Scala. It can be used to encode first-order
logic, higher-order logic, etc, for further processing. Figure 7
shows the logo of Isabelle.


SciPy 

SciPy is an open source Python library used for scientific
computing. It contains modules for optimisation, linear
algebra, calculus, interpolation, Fast Fourier Transforms,
signal processing, image processing, etc. SciPy is part of
the NumPy stack, which also includes tools like Matplotlib,
SymPy, etc. Matplotlib is a plotting library and SymPy is a
library for symbolic computing in Python. The SciPy library
is available under the BSD licence. Figure 8 shows how SciPy
is used to find the factorial of 100. Even in a modest computer
system, we were able to find the factorial of 100,000. But
we will not dare to show the output in this article because
the resulting number is 456574 digits long. The December
2017 issue of OSFY had 108 pages, including the covers.
The page we randomly chose in that issue had 54 lines of
text and the line we randomly chose had 123 characters,
so, a simple calculation will tell you that it will take at least
68 pages of the next issue of OSFY to print this number.
This tells us about the power of SciPy. And with a powerful
computer, you could do wonders with SciPy.
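The page estimate above can be checked with a few lines of standard-library Python. This sketch deliberately avoids SciPy and never builds the huge number itself: it counts the digits of n! through the log-gamma function, an approach chosen here for illustration rather than the one used for Figure 8. The function name and constants are hypothetical labels for the figures quoted in the text.

```python
import math


def factorial_digit_count(n: int) -> int:
    """Number of decimal digits in n!, computed without building n! itself.

    lgamma(n + 1) is ln(n!); dividing by ln(10) gives log10(n!).
    """
    return math.floor(math.lgamma(n + 1) / math.log(10)) + 1


LINES_PER_PAGE = 54     # figures quoted in the text above
CHARS_PER_LINE = 123

if __name__ == "__main__":
    digits = factorial_digit_count(100_000)
    # Whole pages that would be completely filled by the digits.
    full_pages = digits // (LINES_PER_PAGE * CHARS_PER_LINE)
    print(digits, full_pages)
```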


gnuplot 

Scientific research involves representing data in a neat and
concise manner. Graphs are often used to do this. It’s for this
purpose that we can use gnuplot, a command line program
that can plot 2D and 3D graphs of functions and data. It
is frequently used in vector graphics. gnuplot can produce
output in many different image formats like Portable
Network Graphics (PNG), Encapsulated PostScript (EPS),
Scalable Vector Graphics (SVG), Joint Photographic Experts
Group (JPEG), etc. It is a cross-platform utility that runs on
a variety of operating systems like UNIX, Linux, Microsoft
Windows, MacOS, etc. Even though the name of the
software starts with ‘gnu’, it is not part of the GNU project.
In fact, the inclusion of gnuplot in this list can be contested
because, in a very strict sense, it is not free and open source
software: it is licensed under the gnuplot licence,
which gives users the right to modify the source code but
withholds the right to distribute modified versions. But the
extreme popularity of gnuplot made us select it rather than
a free and open source graph plotting utility like xgraph.
Figure 9 shows a 3D graph plotted using gnuplot.


LaTeX 

Finally, let us think about the life of a brilliant
mathematician who has done a lot of research, with
and without using some of the mathematical tools we
have listed just now, to come up with some excellent
theorems. It’s eventually time to write down these results


An IoT solution is an amalgamation of hardware, software and networking
capabilities. Edge computing is required for any successful and optimised IoT
solution. There are various open source frameworks available for edge computing in
the IoT space. This article introduces some of them.


Data is the heart of an IoT system and it needs to be
immediately processed. An IoT system can generate
limitless data within a second, depending on the
various IoT devices assembled under the system, as per
business needs. Limitless data generated from IoT devices or
sources can easily consume network bandwidth and lead to
a need for excess data storage. It is crucial to aggregate and
digitise the data at the periphery of the system, which can then
be communicated to back-end systems. This responsibility
is taken care of by edge computing, which helps to reduce or
optimise the IT infrastructure. These edge computing systems
reside close to the IoT devices/data sources and also enforce
the required security. A major benefit of edge computing is
that it improves the time to action, reduces response time and
optimises the use of network resources. It also helps to reduce
latency and network bottlenecks.
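As a rough illustration of aggregating data at the periphery, the sketch below reduces a stream of raw sensor readings to one small summary record per window, so that only the summaries would need to travel to the back end. The function names, the window size and the sample values are all made up for this example; a real edge framework would add timestamps, buffering and transport.

```python
from statistics import mean
from typing import Dict, Iterable, List


def aggregate_window(readings: Iterable[float]) -> Dict[str, float]:
    """Reduce one window of raw readings to a small summary record."""
    values: List[float] = list(readings)
    if not values:
        raise ValueError("window is empty")
    return {
        "count": len(values),
        "min": min(values),
        "max": max(values),
        "mean": mean(values),
    }


def summarise(readings: List[float], window_size: int) -> List[Dict[str, float]]:
    """Split readings into fixed-size windows and summarise each one."""
    return [
        aggregate_window(readings[i:i + window_size])
        for i in range(0, len(readings), window_size)
    ]


if __name__ == "__main__":
    raw = [21.0, 21.5, 22.0, 22.5, 30.0, 30.5]   # made-up temperature samples
    for record in summarise(raw, window_size=4):
        print(record)
```

Here six readings become two records; with thousands of readings per second the saving in bandwidth and storage grows accordingly.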


Reference architecture of edge computing for IoT solutions


IoT devices and data sources

An IoT system can have various data sources: IP
capable or low-powered devices like sensors, appliances,
applications, social media sites, or data from third party
systems. These data sources can generate data in various
formats, frequencies and volumes. This is the layer where
data is generated. Data sources may vary as per an
enterprise or industry’s needs. A solution should support
the data source channels required to meet a business’ needs.
These IoT devices will capture the data and communicate
over IoT protocols to a nearby edge gateway system via
Wi-Fi, Ethernet, Bluetooth, NFC, Zigbee or any other
communication/transport layer protocols.

Figure 1: Reference architecture for edge computing


IoT communication/transport layer protocols

The following are some of the widely used IoT
communication/transport layer protocols.

• Bluetooth: This is a wireless technology for data exchange
between electronic devices over short distances. An
electronic device must meet Bluetooth support standards to
enable communication over Bluetooth protocols.

• Wi-Fi: This is a wireless local area networking technology.
It is a very common and easy-to-set-up communication/
transport layer. Any electronic device within range of a
wireless modem can attempt to access the network.

• NFC: Near Field Communication is a protocol that
enables two electronic devices to establish communication.
It is used for contactless payments, electronic tickets,
mobile payments, sharing contacts, photos, videos, files,
etc. It offers simple and safe two-way interaction between
electronic devices within a specified range.

• Zigbee: This is a high-level communication protocol for
personal area networks. Zigbee is a low power, low data
rate and close proximity wireless network. It is simpler
and less expensive than Bluetooth and Wi-Fi. It is used for
home automation, medical devices, traffic management
systems and other consumer or industrial low power, low
bandwidth requirements for small scale projects.

• LoRaWAN: Low Power Wide Area Network (LPWAN/
LoRaWAN) operates in the radio spectrum for wide area
networks. Similar to Wi-Fi, it can be set up for using
lower radio frequencies with a longer range. It allows
low-powered devices to communicate with Internet
connected applications.

There are some other transport layer protocols too,
such as Z-Wave, 6LoWPAN, Thread, Cellular, Ethernet,
Eddystone, WiMax, etc.


IoT data protocols
Data protocols are a set of rules for establishing
communication between various entities of the system. These
protocols define the syntax, semantics, synchronisation of
data, and provision for error recovery. The following are some
of the widely used IoT data protocols.

• MQTT: Message Queuing Telemetry Transport (MQTT)
is designed to provide embedded connectivity between
applications and middleware on the one side, and networks
and communications on the other. It follows a publish/
subscribe architecture, where the system consists of three
main components: publishers, subscribers and a broker.

• AMQP: The Advanced Message Queuing Protocol
(AMQP) runs over TCP and provides a publish/subscribe
architecture that is similar to MQTT. The difference is
that the broker is divided into two main components: the
exchange and queues. The exchange is responsible for
receiving publisher messages and distributing them to
queues based on predefined rules and conditions. Queues
basically represent the topics, and their subscribers get the
sensory data whenever it is available in the queue.

• CoAP: The Constrained Application Protocol (CoAP) is for
constrained devices called nodes to communicate with the
wider Internet using the same protocols. It is designed to be
used under the same constrained communication network
between devices and nodes on the Internet. Multi-cast, low
overheads and simplicity are important features of CoAP.

• HTTP: This is the standard protocol for Web services and
is used in IoT solutions. The most popular architectural
style, called RESTful, is widely used on mobile and Web
applications, and is being considered for IoT solutions.

There are some other data communication protocols such as:

• Mosquitto: An open source MQTT broker
• XMPP (Extensible Messaging and Presence Protocol)
• DDS (Data Distribution Service for real-time systems)
• LLAP (Lightweight Local Automation Protocol)
• LWM2M (Lightweight M2M)
• SSI (Simple Sensor Interface)


The edge gateway and middleware gateway 
Edge gateways process initial accumulated data that is
received from IoT devices/sources and convert that into
an expected format. They share the accumulated data
from IoT devices to the middleware gateway, which is
an IoT and API gateway. The IoT gateway empowers the
system to have bi-directional communication between IoT
devices and storage/analytics systems. It helps to regulate
the environmental changes and detect possible issues
with the functioning of a system. These gateways protect
information moving in both directions, as well as prevent
unauthorised control of IoT devices from the outside world.
They also facilitate the device’s life cycle management.
Figure 2 depicts the various stages of this life cycle and the
respective features for device management.
Figure 2: Life cycle management of a device


100 | FEBRUARY 2018 | OPEN SOURCE FOR YOU | www.OpenSourceForU.com 


Application interfaces 
An IoT solution consists of some Web application interfaces 
like the Admin Information Dashboard or Cockpit. This 
Cockpit provides the interface to administer the IoT devices, 
as well as the configurations to manage and control the IoT 
devices. There can be multiple Web application interfaces. 
One can be at each deployment unit, which represents the 
edge computation at the periphery of the system. There 
can be other interfaces that manage the entire system’s 
configurations from a central portal. 
The Cockpit can be implemented using front-end 
technologies like Angular, Node, etc. 
* Angular: This is a JavaScript-based, open source,
front-end Web application framework. It supports the
model-view-controller (MVC) and the model-view-viewmodel
(MVVM) architectures. It aims to simplify the
development and testing of single page Web applications.
* Node: This is an open source, cross-platform JavaScript
runtime environment for executing JavaScript code on
the server side. It has an event-driven architecture and its
design choices aim to optimise throughput and scalability
in Web applications.


Back-end platform and the open source
technologies used

An IoT solution’s back-end system has the following functions. 

* Ingestion: An ingestion framework is needed to
extract data from various data sources and send it
to the processing tools. Data can be ingested as a
real-time stream or in batches, depending on the
business’ needs. Ingesting data into the system for
further processing requires supporting frameworks
that fit the rest of the system’s technology stack. The
data input to the system can come from Hadoop data
clusters or SQL data exports, or be fed to a
messaging server like Kafka or to other data stream
processing frameworks/tools like Apache Camel,
Spark Streaming, Storm and Flume.

* Orchestration and processing: This is the layer in
which business needs are aligned with applications,
data and infrastructure. System orchestration
and automation are important layers of the entire
solution. They define various rules, policies, business
logic, automated workflows, provisioning, change
management, etc. The entire system’s scalability
depends a lot on this layer’s scalability and
extensibility. There are many frameworks
or tools that can be used for system orchestration and
processing in enterprises with large data, like NiFi,
Oozie and Apache Spark.

Figure 3: Reference architecture for cloud services

* Modelling and analytics: Data modelling for
organisational data can span multiple levels of abstraction,
ranging from conceptual to logical and physical. The
conceptual models can be used for discussions with
business people and domain experts. The logical model
adds more precision, and provides the information
to discuss and decide on logical representations. The
physical model relies on various technology-specific
data, and helps to prepare a target environment, such
as a database management system. There are various
frameworks or technologies available for data modelling
and the respective analytics, like R, Python, Spark ML,
Kibana, Elasticsearch and TensorFlow.

* Data storage: Data storage components are the core of
any solutions approach. From raw data to mission-critical
records, the choice of storage can have a profound impact
on the capacity, performance, long-term reliability and
durability of any storage infrastructure. The system should
have failover, backup and disaster recovery mechanisms.
Data generated from various data sources may be
structured, unstructured or semi-structured. Solutions
should have a provision to process, format and massage
the data (if needed), and then provide the relevant storage
options like SQL, NoSQL, a data warehouse, etc. The data
storage platform depends on the nature of the data and
the business needs. Some of the open source data storage
options are MongoDB, MySQL, PostgreSQL, Redis,
CouchDB, HBase and Cassandra.


Reference architecture of edge
computing for IoT solutions with the
cloud technology stack

Cloud service providers have rich sets of services and 
technology options for IoT applications and edge computing. 
Taking Azure and AWS as the leading cloud service providers,
Figure 3 depicts the technology choices at various
layers of the reference architecture.


The following are the cloud technologies and services
available.


* Azure IoT Hub: This is a fully managed service
that enables reliable and secure bi-directional
communication between millions of IoT devices and
a solutions back-end. It provides multiple device-to-cloud
and cloud-to-device communication options.
These options include one-way messaging, file
transfers and request-reply methods. It has built-in
declarative message routing to other Azure services,
and provides extensive monitoring for device
connectivity and device identity management events.

* Azure API Management: This is a solution for
publishing APIs to external and internal customers. It
quickly creates consistent and modern API gateways for
existing back-end services hosted anywhere, secures
and protects them from abuse and overuse, and provides
insights into usage and health. Azure API Management
provides the core competencies to ensure a successful
API program through developer engagement, business
insights, analytics, security and protection.

* Azure IoT Edge: This is an open source solution from
Microsoft Azure for edge computing. It offers device
management, compute, analytics, operation with offline
connectivity and real-time decision making at the edge
of an IoT solution. It also helps to reduce the bandwidth
costs incurred for data communication to the cloud from
the edge. It is a service that delivers cloud capabilities
to the edge. Azure IoT Edge provides easy orchestration
between code and services, so they flow securely between
the cloud and the edge to distribute intelligence across IoT
devices. It easily integrates Microsoft Azure and third-party
services or augments existing services to create a
custom IoT application with business logic. It helps to
put intelligence into devices which can act locally, based
on the data they generate, while also taking advantage of
the cloud to configure, deploy and manage these devices
securely and at scale.

* AWS IoT: This is a managed cloud platform that lets
connected devices easily and securely interact with cloud
applications and other devices. It can support billions of
devices and trillions of messages. It can process and route
those messages to AWS endpoints and to other devices
reliably and securely. With AWS IoT, applications can
keep track of and communicate with all devices, all the
time, even when they aren’t connected.

* AWS API Gateway: This is a fully managed service that
makes it easy for developers to create, publish, maintain, 
monitor and secure APIs at any scale. With a few 
clicks in the AWS Management Console, the user can 
create an API that acts as a ‘front door’ for applications 
to access data, business logic, or functionality from 
back-end services. It handles all the tasks involved in 
accepting and processing up to hundreds of thousands 
of concurrent API calls, including traffic management, 
authorisation and access control, monitoring, and API 
version management. 

* AWS Greengrass: This is a service from AWS for edge
computing. It offers local compute, messaging, data 
caching, sync and machine learning capabilities for 
connected devices. It ensures quick responses to local 
events and actions along with local storage, and helps to 
minimise the cost of transmitting IoT data to the cloud. 


Admin information dashboard 

Both the leading cloud service providers provide
administrative consoles to configure, operate, view and
manage the IoT devices and data protocols. IoT solutions
powered by cloud services can also use these consoles.


Other IoT tools and services

Some of the other IoT specific tools, technologies or services 
from Azure/AWS are AWS IoT SDK, AWS IoT Registry, 
AWS IoT Security, AWS Rule Engine, AWS Device Shadows, 
Azure IoT Device SDK, Azure IoT Protocol Gateway, Azure 
AD and Azure Device Provisioning. 


Back-end technologies 

An end-to-end IoT solution can use many other services from 
cloud service providers like Azure Data Factory, Azure Stream 
Analytics, Azure Storage, Azure Cosmos DB, AWS Kinesis, 
AWS Data Pipeline, AWS SQS, AWS DynamoDB and AWS S3. 


Note: Do refer to the AWS and Azure sites for the latest
updates on their respective service details.


Reference architecture of edge computing 
for IoT solutions with the open source
technology stack 

The following are the technologies/frameworks available. 

* Kura: Kura is a Java/OSGi-based open source framework
for IoT systems. It supports access to the underlying
hardware like serial ports, GPS, watchdog, etc. It
supports the management of IoT devices, including
configurations and communication. It provides M2M/IoT
integration platforms and gateway management. Kura
also provides various APIs, interfaces and capabilities for
communication, connectivity, network management, data
management, messaging, remote management, etc.

* Kapua: This is an IoT platform to manage and integrate
devices and their data. It provides an integration
framework and other features like a device registry, device
management, messaging services, data management and
application enablement.

* Spring Boot: This is a Spring framework project for the
development of REST APIs. It facilitates the creation of
standalone, production-grade Spring-based applications. It
works with minimal Spring configuration.


Figure 4: Reference architecture for open source 
Other IoT tools and frameworks

There are many other open source IoT tools and frameworks,
as listed below.

* Eclipse Hono: This tool works for connectivity and
protocol support. It provides various APIs for device
communication using arbitrary protocols and also
supports protocol customisation.

* Kaa: This open source platform is for implementing
end-to-end use cases for IoT solutions. It offers various
features such as device management, data collection,
configuration management, messaging, events,
notifications, end points, etc.

* ThingsBoard: This open source IoT platform
facilitates data collection, processing, visualisation,
device management, a rule engine, asset management,
customisation and integration.

Some other IoT open source projects are OpenIoT,
DeviceHive, Node-RED, IoTivity, Mango and OpenThread.

An IoT solution is an amalgamation of hardware,
software and networking capabilities. Emerging
technologies and IoT devices will lead to the creation of
more IoT applications for both consumer and industrial
use cases. Edge computing is required for any successful
and optimised IoT solution. There are various open source
frameworks available in the IoT space. Cloud service
providers are also providing very rich services for IoT
solutions and edge computing. The recommended guidelines
are that IoT solutions should be loosely coupled, modular,
platform-independent, and based on open standards.
Figure 9: Graph plotted using gnuplot


to bring out a good scientific paper. Do remember that 
mathematics is full of symbols like m, V, %, etc, which 

are normally not available on the keyboard. Preparing 
scientific papers in the required format of various scientific 
journals is a very tedious task due to reasons like this. 

So, every mathematician will eventually use a document
preparation system and the best one is LaTeX. It is widely
used in academia and the industry, to prepare scientific
documents in areas like mathematics, statistics, computer
science, engineering, etc. LaTeX is cross-platform, free
and open source software licensed under the LaTeX
Project Public License. It can also be used for efficient
reference management. Figure 10 shows the time-dependent
Schrödinger equation rendered using LaTeX.

Figure 10: Schrödinger equation with LaTeX

Before concluding the article, we need to make a
confession: like all Top Ten lists, this too is coloured by
personal preferences and prejudices. But we have tried to
make the list as diverse and useful as possible so that not just
students and practitioners of mathematics, but professionals in
computer science, physics, chemistry, etc, can also appreciate
and use these tools to their advantage.
Using the proxy server via the command line
Often administrators need to use the proxy server via
the command line. For this purpose, we can use the ‘export’
command. Open the ‘.bashrc’ file in your text editor by
using the following command from your terminal:


vi ~/.bashrc 


Then add the following four lines at the end of the
‘.bashrc’ file in your home folder:


export http_proxy="http://username:password@proxyserver:port"
export https_proxy="https://username:password@proxyserver:port"
export ftp_proxy="ftp://username:password@proxyserver:port"
export socks_proxy="socks://username:password@proxyserver:port"


Once you open a new terminal (or run ‘source ~/.bashrc’),
command-line applications that honour these variables will
use the proxy to access the Internet.
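If the proxy is needed only occasionally, the same variables can be set and cleared from small shell functions instead of being fixed in ‘.bashrc’. The following is a minimal sketch: the function names proxy_on and proxy_off are hypothetical, and the user name, password, host and port are placeholders to be replaced with real values.

```shell
#!/bin/sh
# Hypothetical helpers, not part of any standard tool.
# Usage: proxy_on USER PASSWORD HOST PORT
proxy_on() {
    # Build the "user:password@host:port" part once and reuse it.
    auth="$1:$2@$3:$4"
    export http_proxy="http://$auth"
    export https_proxy="https://$auth"
    export ftp_proxy="ftp://$auth"
}

# Remove the variables again so that connections go out directly.
proxy_off() {
    unset http_proxy https_proxy ftp_proxy
}

# Example with placeholder values:
proxy_on username password proxyserver 3128
echo "$http_proxy"
proxy_off
```

A password that contains characters such as ‘@’ or ‘:’ has to be URL-encoded before it is placed in these variables.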
—Nagaraju Dhulipalla, nagarajunice@gmail.com 


Enable the EPEL repository in CentOS/Red Hat
Many a time, we need to install additional packages
on GNU/Linux distributions but installing them from
unknown repositories is not safe. This is where the EPEL
(Extra Packages for Enterprise Linux) repository comes to
the rescue. All EPEL packages are maintained by the Fedora
project and can be installed on RHEL, CentOS, Scientific Linux
and other RPM based distributions. One of the biggest
advantages is that the repository is 100 per cent open
source and free to use. Additionally, it does not duplicate
any core packages and has no compatibility issues. So
let us configure the EPEL repository in CentOS.

To enable the EPEL repository, we need to download 
and install the RPM package. The following commands can 
be used on CentOS 7.x: 


[root]# wget http://dl.fedoraproject.org/pub/epel/7/x86_64/e/epel-release-7-10.noarch.rpm
[root]# rpm -ivh epel-release-7-10.noarch.rpm


If you are using CentOS 6.x, then the following
commands will do the job:


[root]# wget http://download.fedoraproject.org/pub/epel/6/x86_64/epel-release-6-8.noarch.rpm
[root]# rpm -ivh epel-release-6-8.noarch.rpm


That’s it. The EPEL repository is now enabled on 
your system. We can verify it by listing the configured 
repositories. 


[root]# yum repolist 
—Narendra Kangralkar, narendrakangralkar@gmail.com 


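Finding syntax errors in .php pages
Here is a simple command to help you find syntax
errors in .php pages. php -l checks each file, and grep
drops the ‘No syntax errors detected’ lines so that only
the errors remain:


find . -type f -name "*.php" -exec php -l '{}' \; | grep '^[^N]'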
—Remin Raphael, remin13@gmail.com 


How to create dialogue boxes using Zenity
and notify-send

Zenity enables the user to create various types of 
simple dialogue boxes that interact graphically with 
the user. These include notification, message and text 
information dialogue boxes. 

Shown below is an example of a Zenity information
dialogue box:


$ zenity --info --text "good morning"


...displays an information dialogue box with the text
mentioned in the command. (The --text-info option
displays a text information dialogue instead.)


$ zenity --calendar


...displays the calendar dialogue box as shown in Figure 1.

The notify-send command allows you to send desktop 
notifications to the user via a notification daemon from the 
command line. This is useful to inform the desktop user 
about an event or display some form of information without 
getting in the user’s way. To use this, you need to install the 
following package: 


$ sudo apt-get install libnotify-bin


An example is given below:


$ notify-send "Welcome to UNIX"
$ notify-send -t 1000 -u normal "Script Running Successfully..."


where:
-t 1000 specifies the timeout in milliseconds
(1000 milliseconds = 1 second)
-u normal indicates the urgency level (i.e., low,
normal, or critical).
—Neethu C. Sekhar, nitucskr@gmail.com 


Printing the directory structure on the terminal 
Sometimes we need to print the nested directory 
structure of a project’s source code for documentation 
or tutorial purposes. The tree utility comes to the rescue 
here. You can install it on Debian or Ubuntu using 
the following command: 


$ sudo apt-get install tree


To get the structure of the directory on the terminal, use 
the command given below: 


$ tree directory_name


For example:


$ tree e2e


The output will be as shown below:


e2e
├── app.e2e.ts
├── app.po.ts
├── tsconfig.json
└── typings.d.ts

0 directories, 4 files


—Amar Shukla, amarshukla123@gmail.com 


find, xargs and awk tweaks
find, xargs and awk are some of the powerful tools 
that can be used on a Linux command line. Here are a few 
examples of the same: 


1. The following code is to archive and delete files in a 
folder based on a modified timestamp: 


find /path/to/your/directory/ -maxdepth 1 -type f -mtime +35 -mtime -90 -print0 |
  xargs -0 -n 1000 bash -c 'if [ $# -ne 0 ]; then echo "archiving $# files"; tar ufP /path/to/archive/archive_file.tar "$@" 2>/dev/null; rm -f "$@"; fi' bash


The description of this code is as follows. The find
command looks for files that were last modified more
than 35 days but less than 90 days ago. xargs
processes up to 1000 files in each batch; each batch is
added to the tar archive and the files are then removed
from disk.
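Since the command deletes files, it is worth trying the idea on a scratch directory first. The following is a minimal sketch, assuming the GNU versions of find, xargs, tar and touch; all paths and file names are made up for the illustration, and it uses tar's append mode (-r) instead of update mode and removes files only if tar succeeds.

```shell
#!/bin/sh
# Illustrative only: everything happens inside a throwaway temp directory.
work=$(mktemp -d)
mkdir "$work/data"

# One file dated 60 days back (inside the 35-90 day window), one fresh file.
touch -d '60 days ago' "$work/data/old.log"
touch "$work/data/new.log"

# The archive path is passed as the first argument and shifted off, so
# "$@" holds only file names; the $# test skips an empty batch.
find "$work/data" -maxdepth 1 -type f -mtime +35 -mtime -90 -print0 |
  xargs -0 -n 1000 sh -c '
    archive=$1; shift
    if [ $# -ne 0 ]; then
      tar -rPf "$archive" "$@" && rm -f "$@"
    fi' sh "$work/archive.tar"

# Record what ended up in the archive and what is left on disk.
archived=$(tar -tPf "$work/archive.tar")
remaining=$(ls "$work/data")
echo "archived: $archived"
echo "remaining: $remaining"
rm -rf "$work"
```

Only the 60-day-old file should be listed as archived, while the fresh file stays in place.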


2. To print all the extensions in a given folder, use the 
following command: 


find /path/to/your/dir/ -type f | awk -F"." '!a[$NF]++{print $NF}'


The description is as follows. find lists all the files in the
given directory and feeds the output to awk, which splits
each file name on “.” (period) and prints the last field,
which is the extension, the first time it is seen.
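The same pipeline can be checked on a scratch directory. This is a minimal sketch with made-up file names; the output is sorted only to make it predictable. Note that files without an extension produce misleading output, because awk then prints whatever follows the last period in the path.

```shell
#!/bin/sh
# Illustrative only: three files with two distinct extensions.
demo_dir=$(mktemp -d)
touch "$demo_dir/a.txt" "$demo_dir/b.txt" "$demo_dir/c.log"

# awk splits each path on "." and prints the last field the first
# time that value is seen, so each extension appears once.
extensions=$(find "$demo_dir" -type f | awk -F"." '!seen[$NF]++ {print $NF}' | sort)
echo "$extensions"
rm -rf "$demo_dir"
```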


3. To get the total size of all the files returned from find, 
use the following command: 


find /path/to/your/dir/ -type f -exec du {} \; | awk '{total+=$1} END{print total}'


The description is as follows. find lists all the files in a
directory, du reports the disk usage of each one (in 1KB
blocks by default with GNU du; add -b for sizes in bytes), and
awk adds up the first column to print the total size of all files.
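When an exact byte count is wanted, the summing step can be verified on files of known size. The following is a minimal sketch with made-up file names; it swaps du for wc -c, which counts bytes rather than disk blocks, and keeps the same find and awk structure.

```shell
#!/bin/sh
# Illustrative only: two files whose sizes are known in advance.
demo_dir=$(mktemp -d)
printf 'hello' > "$demo_dir/a.txt"          # 5 bytes
printf 'open source' > "$demo_dir/b.txt"    # 11 bytes

# wc -c prints "<bytes> <path>" for each file; awk sums the first column.
total=$(find "$demo_dir" -type f -exec wc -c {} \; | awk '{total += $1} END {print total}')
echo "$total"
rm -rf "$demo_dir"
```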


—Sarath Chandra Raja Akurathi, 
sarath.c.akurathi@gmail.com 


Share Your Open Source Recipes! 


The joy of using open source software is in finding ways to get 
around problems—take them head on, defeat them! We invite 
you to share your tips and tricks with us for publication in 
OSFY so that they can reach a wider audience. Your tips could 
be related to administration, programming, troubleshooting or 
general tweaking. Submit them at www.opensourceforu.com. 
The sender of each published tip will get a T-shirt. 
OSFY DVD


DVD OF THE MONTH


Software that will help you to secure your network.


February 2018


A live CD/DVD or live disk contains a bootable operating 
system, the core program of any computer, which is 
designed to run all your programs and manage all your 
hardware and software. 

Live CDs/DVDs have the ability to run a complete, 
modern OS on a computer even without secondary 
storage, such as a hard disk drive. The CD/DVD directly 
runs the OS and other applications from the DVD drive 
itself. Thus, a live disk allows you to try the OS before 
you install it, without erasing or installing anything on 
your current system. Such disks are used to demonstrate 
features or try out a release. They are also used for 
testing hardware functionality, before actual installation. 
To run a live DVD, you need to boot your computer 
using the disk in the ROM drive. To learn how to set 
a boot device in BIOS, please refer to the hardware 
documentation for your computer/laptop. 
Manjaro Linux 17.1.0 GNOME (Live)

Manjaro is a user friendly Linux distribution based on the 
independently-developed Arch operating system. Within 
the Linux community, Arch is renowned for being an 
exceptionally fast, powerful and lightweight distribution 
that provides access to the very latest cutting-edge — and 
bleeding-edge — software. The latest edition boasts of a 
simplified, user friendly installation process with automatic 
detection of your computer's hardware (e.g., graphics cards). 
It comes with updated versions of GIMP, Firefox, Wine and 
other daily use apps. 


FreespireOS 3 

FreespireOS is a 64-bit freely available and open Linux 
based OS that is geared towards open source users and 
developers. It has all the applications users will need for 
the consumption of media, and developer tools for those 
that want to tinker with the system and deploy their own 
custom software and kernels. It contains a highly advanced 
desktop that is comfortable for users who want to be more 
productive; it is easy to use and requires very little retraining. 
It also includes a whole host of applications and features 
that are geared towards the community user. 


The DVD also has ISO images of different firewalls. 

* IPFire: This is an open source firewall distribution.

* OPNsense: This is another open source firewall.

* pfSense: This is supposedly the world's most trusted
open source network security solution.

* ZeroShell: An open source network appliance.




Asia's #1 website on open source 
in an all-new avatar! 


Just at the click of a button. 


Whether you surf the Open Source For You website from your smartphone, 
tablet, PC or Mac, you will enjoy a unified experience across all devices. 


www.OpenSourceForU.com 


You can also submit your tips, contribute with your ideas or extend your subscription directly from the website. 


Remember to follow us on Twitter (@OpenSourceForU) and like us on Facebook (Facebook.com/OpenSourceForU) to get regular updates on open source developments. 
is hiring


Interested? 


Mail Resume + Cover Letter to 
contact@loonycorn.com 


You: 


* Really into tech - cloud, ML, anything and everything
* Interested in video as a medium
* Willing to work from Bangalore
* In the 0-3 years of experience range


* ex-Google | Stanford | INSEAD
* 100,000+ students
* Video content on Pluralsight, Stack, Udemy...


